Important Dates


Paper Submission: December 11, 2017 (extended from November 27, 2017)

Notification of Acceptance: February 5, 2018

Camera Ready Paper: February 26, 2018

Conference Dates: June 25-28, 2018

Dr. Behrouz Far, PhD.

Department of Electrical and Computer Engineering,
University of Calgary

Dr. Emad Amin Mohammed, PhD.

Software Engineering Department,
Lakehead University

Accelerating Data Analytics using Hadoop, Spark, and TensorFlow

Substantial data growth, disparate data warehouses, and differing data formats hinder organizations from gaining a better understanding of their business domain, because substantial effort is consumed in cleansing and transforming the data into a standard, usable data model for processing. Big Data analytics tools and deep learning libraries are widely used in operations research, recommendation systems, healthcare systems, and personalized health outcome improvement, among other areas, and are becoming increasingly operational in nature. However, this means more data marts, more preprocessing steps, and a more comprehensive reach throughout organizations. Keeping pace with this evolution requires designing predictive analytics models that provide quantifiable and actionable insights to improve a specific business domain. To this end, organizations start by targeting an innovative answer to a business question.

Convolutional neural networks have developed rapidly over the last few years and play a significant role in image recognition and automated translation. TensorFlow is a relatively new framework, released by Google almost two years ago, for graph-based numerical computation and the development of deep learning neural networks. In this tutorial, we will demonstrate how to install and configure an environment for Big Data and Deep Learning. We will also demonstrate how to use TensorFlow and Spark together to train and apply deep learning models in a data science project.

The data science initiative at the University of Calgary is one of the university's research priority pillars. In Winter 2015, the first such undergraduate course at the University of Calgary, “Engineering Large-Scale Analytics Systems,” was offered by the Department of Electrical and Computer Engineering (ECE); it was designed and delivered by the presenters. Recently, we had the opportunity to build the Multi-Modal Data Fusion (MMDF) lab in the ECE department.
The lab is equipped with Hadoop and Spark clusters running on top of commodity hardware. The clusters are used to carry out various data science projects involving significant volumes of data and to design proof-of-concept prototypes for different business domains, e.g., autonomous vehicles, healthcare analytics, and software development.