Texas AI Summit Sessions

Don't plan on leaving early. Martin Fowler (Refactoring) will be speaking at 7PM!

We will be publishing the last round of sessions in the next few days.

Opening Keynote
Data Driven Natural Language Generation: Linking Humans to the Machine with the Power of Narrative

Kristian Hammond - Narrative Science

People have difficulty with data. At best, the tools of spreadsheets, charts, and graphs provide a view into data without giving us a window onto the insights hidden within it.
On the other side of the coin, there is language. Language provides us with the ability to explain the past, describe the present and project the future. The power it gives us to communicate what we know to each other is one of the more striking ways in which we are different from animals and machines.
As the need to understand the data that now surrounds us becomes more pressing, it is clear that we need to build systems that can map the meaning of the data that the machine is collecting onto language that communicates it to humans, regardless of data skills.
In this talk, I will outline where we are with regard to data analytics and the technology of automatic narrative generation. I will look at how language generation plays the crucial role of bridging the gap between the Big Data world of numbers and symbols and our need for understandable insights. I will dive into use cases from business, education and everyday life to show how the power of automatically generated narratives can provide us all with the insights that are still trapped in the wealth of data we now control.

Character-Level Convolutional Neural Networks for Semantic Classification

Paul Azunre / Numa Dhamani - New Knowledge

A character-level convolutional neural network (CNN) motivated by applications in "automated machine learning" (AutoML) is proposed to semantically classify columns in tabular data. Simulated data containing a set of base classes is first used to learn an initial set of weights. Hand-labeled data from the CKAN repository is then used in a transfer-learning paradigm to adapt the initial weights to a more sophisticated representation of the problem (e.g., including more classes). In doing so, realistic data imperfections are learned and the set of classes handled can be expanded from the base set with reduced labeled data and computing power requirements. Results show the effectiveness and flexibility of this approach in three diverse domains: semantic classification of tabular data, age prediction from social media posts, and email spam classification. In addition to providing further evidence of the effectiveness of transfer learning in natural language processing (NLP), our experiments suggest that analyzing the semantic structure of language at the character level without additional metadata (i.e., network structure, headers, etc.) can produce competitive accuracy for type classification, spam classification, and social media age prediction. We present SIMON, an open-source toolkit for Semantic Inference for the Modeling of ONtologies that implements this approach in a user-friendly and scalable/parallelizable fashion.
Intended audience: Natural Language Processing, Neural Networks, Automated Machine Learning
Skills required: Beginner to intermediate knowledge of classification and neural networks
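The character-level framing above can be made concrete with a small preprocessing sketch. This is illustrative only, not the SIMON implementation: it shows how tabular cell values might be one-hot encoded character by character into the fixed-size tensors a character-level CNN would consume. The alphabet and maximum length here are hypothetical choices.

```python
import numpy as np

# Hypothetical alphabet and fixed sequence length for illustration
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,:;!?'@-_/()"
CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}
MAX_LEN = 20

def encode(cell: str) -> np.ndarray:
    """One-hot encode a cell value, character by character.

    Unknown characters map to an all-zero column; values longer than
    MAX_LEN are truncated, shorter ones are zero-padded.
    """
    x = np.zeros((len(ALPHABET), MAX_LEN), dtype=np.float32)
    for j, ch in enumerate(cell.lower()[:MAX_LEN]):
        i = CHAR_TO_IDX.get(ch)
        if i is not None:
            x[i, j] = 1.0
    return x

# A date-like and an email-like cell, as might appear in two columns
batch = np.stack([encode(v) for v in ["1997-04-12", "jane@example.com"]])
print(batch.shape)  # (2, alphabet_size, MAX_LEN)
```

A CNN then convolves over the character axis of such tensors to learn type-indicative character patterns (digits and dashes for dates, the "@" motif for emails).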

Deep Generative Models and Inverse Problems

Alexandros Dimakis - University of Texas, Austin

In this talk we will explain what deep generative models are and how they can be used to solve linear inverse problems. Linear inverse problems involve the reconstruction of an unknown vector (e.g. a tomography image) from an underdetermined system of noisy linear measurements. Most results in the literature require that the reconstructed signal has some known structure, e.g. it is sparse in some basis (usually Fourier or Wavelet). In this work we show how to remove such prior assumptions and rely instead on deep generative models (e.g. Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs)). We show how the problems of image inpainting (completing missing pixels) and super-resolution are special cases of our general framework. We generalize theoretical results on compressive sensing for deep generative models and discuss several open problems.
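The core recipe described above can be sketched in a few lines: search the generator's latent space for a code z whose output matches the measurements, i.e., minimize ||A G(z) - y||^2. As a hedged toy (the talk concerns trained GANs/VAEs), a random linear map stands in for the generator so the example stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 50, 20, 5                  # signal dim, measurements (m < n), latent dim
G_mat = rng.standard_normal((n, k))  # toy linear "generator": G(z) = G_mat @ z
A = rng.standard_normal((m, n))      # measurement matrix

z_true = rng.standard_normal(k)
x_true = G_mat @ z_true
y = A @ x_true                       # underdetermined: 20 equations, 50 unknowns

# Gradient descent on f(z) = ||A G(z) - y||^2 over the latent space
M = A @ G_mat
step = 1.0 / (2 * np.linalg.norm(M, 2) ** 2)  # safe step: 1/L for this quadratic
z = np.zeros(k)
for _ in range(10_000):
    z -= step * 2 * M.T @ (M @ z - y)

x_hat = G_mat @ z
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(rel_err)  # small: the generative prior resolves the ambiguity
```

Because the signal lies in the generator's low-dimensional range, 20 measurements suffice to recover a 50-dimensional vector; inpainting and super-resolution correspond to particular choices of the measurement operator A.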

Interpretability of ML Systems: Can Physical Models Learn from Deep Learning?

Graham Ganssle - Expero Inc.

The development of deterministic physical models is founded on (or verified by) experimentation. When the results of these experiments don't match our theories, it's time to change the theory. By interpreting machine learning models of these same physical systems, we can improve our deterministic models, thus increasing our understanding of physics. We claim this is true across all fields of study in which practitioners are building machine learning models. The hard part (and a current field of intense research) is the interpretation of the latent spaces of these machine learning models to extract information used to correct our models.
Join us as we demonstrate the extraction of latent space information and apply it to a set of physical models to increase the accuracy of the models. We’ll show three open source projects aimed at model interpretability, and demonstrate their use on our physics neural network. We’ll discuss several custom approaches, like factor analysis, to model interpretability and show why they’re powerful. We’ll wrap up with a discussion of where these techniques are applicable and why model interpretability is the future career of most data scientists.
Intended audience: introductory data scientists, managers of data science and engineering teams, product owners interested in applying machine learning.

Using Deep Learning to Measure Objects in 3D Images

Graham Ganssle - Expero Inc.

This talk is an end-to-end discussion of the ideation, development, and deployment of a deep learning system crafted to extract dimensionality and volumetric information from 3D images. The client wanted the ability to extract the dimensionality of packages in real time as they stream past a sensor on a conveyor belt. Our system uses a high-speed 3D camera to capture an image, processes the data, infers the dimensionality of the package, and delivers a result in real time.
In this presentation we’ll discuss two approaches taken, one, a deterministic approach used as a baseline, and two, the deep learning system used for production. We’ll tear apart the details and the surprising results of our deep learning methodology using texture images, depth-wise point clouds, and a combination of both. We’ll also discuss in detail the training-time and inference-time cloud architecture used so attendees can gauge the simplicity for deploying their own models.
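As a hedged sketch of the kind of deterministic baseline the talk contrasts with deep learning (the production system's details are not shown here): given a 3D point cloud of an isolated package, its dimensions and volume can be estimated from the axis-aligned bounding box. The synthetic point cloud below is a stand-in for camera output.

```python
import numpy as np

# Hypothetical point cloud sampled from a 2 x 3 x 1 box with a corner at the origin
rng = np.random.default_rng(42)
points = rng.uniform(low=[0, 0, 0], high=[2.0, 3.0, 1.0], size=(1000, 3))

# Axis-aligned bounding box: per-axis extent of the cloud
dims = points.max(axis=0) - points.min(axis=0)  # width, depth, height
volume = float(np.prod(dims))
print(dims, volume)
```

A baseline like this is fast but brittle (tilted boxes, occlusion, and sensor noise all bias it), which is one motivation for the learned approach.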

Fighting Human Trafficking with AI

Mayank Kejriwal - USC Information Sciences Institute

The growth of the web combined with the ease of sharing information it makes possible has led to increased illicit activity both on the Open and Dark Web, an egregious example being human trafficking. The DARPA MEMEX program, which funded research into domain-specific search, has collected hundreds of millions of online sex advertisements, a significant (but unknown) number of which are believed to be sex (and human) trafficking instances. At the same time, such data also provides an opportunity to study, investigate, and ultimately prosecute perpetrators of human trafficking by grouping and extracting patterns from millions of ads using automatic machine learning and natural language processing techniques.
Mayank Kejriwal discusses the development of a knowledge-centric architecture called Domain-specific Insight Graphs (DIG)—built over three years of MEMEX-funded research—that integrates cutting-edge AI techniques from a variety of fields. DIG reads and processes millions of ads from the web and places this information before investigators using a frontend interface. At the time of writing, DIG is being used by over 200 law enforcement agencies in the US to combat human trafficking and has led to actual prosecutions in both San Francisco and New York. DIG has also been extended in promising ways to combat other social problems like securities fraud and counterfeit electronics manufacturing.
Mayank offers an overview of DIG and explains how knowledge-centric architectures can help facilitate AI for social good. Along the way, he shares case studies on its successes and the key lessons learned during its development.

Addressing Training Data Bias in Machine Learning

Dr. Cheryl Martin - Alegion

Bias exhibited in the performance of machine learning models usually arises from the training data. Machine learning professionals are taught to recognize and mitigate bias in the training process, where bias is interpreted in the sense of the "bias-variance tradeoff." However, it is much less common for a machine learning curriculum to include content about how to prevent, recognize, and mitigate bias that arises from the training data. This talk will discuss three types of bias in training data, and it will describe approaches and techniques for recognizing and addressing each type.
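One common form of training-data bias in this vein is representation bias: a subgroup is under-sampled relative to the population the model will serve. As a hedged illustration (not necessarily one of the talk's three types, and the data here is hypothetical), a simple mitigation is to compare subgroup frequencies against the known deployment distribution and attach importance weights that rebalance the training set.

```python
from collections import Counter

train_groups = ["a"] * 800 + ["b"] * 200  # hypothetical subgroup labels in training data
population = {"a": 0.5, "b": 0.5}         # known deployment distribution

counts = Counter(train_groups)
n = len(train_groups)
# weight = target proportion / observed proportion, per subgroup
weights = {g: population[g] / (counts[g] / n) for g in counts}
print(weights)  # under-represented group "b" is up-weighted
```

Passing such weights as per-example sample weights during training makes the loss reflect the deployment population rather than the skewed sample.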

Practical Methods for Overcoming the Machine Learning Data Bottleneck (90 minutes)

Jonathan Mugan - Deep Grammar

Machine learning is powerful, but it can be hard to reap its benefits without large amounts of labeled training data. Labeling data by hand can be time-consuming, expensive, and impractical; and sometimes you don’t even have sufficient examples to label, especially of the rare events that are most important. This session will provide practical methods to overcome this data bottleneck. You will learn how to use heuristics to label data automatically, and you will learn how to generate synthetic training examples of rare events using generative adversarial networks (GANs). You will also learn other data augmentation approaches and methods for training models when the training data is imbalanced. The session will also cover how to use machine learning when you only have one or a few examples.
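The heuristic-labeling idea described above can be sketched minimally: several noisy labeling functions vote on each unlabeled example, and a majority vote (a simple stand-in for the more sophisticated label models the session may cover) produces training labels. The spam heuristics below are hypothetical.

```python
# Three hypothetical heuristics, each a weak signal for "spam" (1) vs "not spam" (0)
def lf_has_link(text):    return 1 if "http" in text else 0
def lf_all_caps(text):    return 1 if text.isupper() else 0
def lf_money_words(text): return 1 if "free" in text.lower() or "$" in text else 0

LABELING_FUNCTIONS = [lf_has_link, lf_all_caps, lf_money_words]

def weak_label(text):
    """Majority vote over the labeling functions."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    return 1 if sum(votes) > len(votes) / 2 else 0

examples = [
    "FREE $$$ click http://example.test now",
    "Lunch at noon tomorrow?",
]
print([weak_label(t) for t in examples])  # [1, 0]
```

The resulting noisy labels can then train a conventional classifier at a fraction of the cost of hand labeling.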

Empowering the Humans in the Loop by Synthesizing Machine Learning and Optimization

Ethan Rosenthal - Rosenthal Data, LLC

Dia&Co is a plus-size women’s personal styling service powered by humans and algorithms. The business model consists of a classic human-in-the-loop process wherein multiple machine learning products are employed to inform human stylists’ decision making. These loops naturally face a tradeoff between automation and autonomy. This talk will explore the design choices and specific models that have freed both woman and machine to do what they do best.
I will start with a deep dive into a novel data product built for a new business line. This product combines classical operations research techniques with modern machine learning. Such a combination allows for personalized experiences to be realized in the physical world where business constraints such as limited inventory are paramount. During this deep dive, I will introduce the basics of mixed-integer programming and show how to build a simplified version of the data product using all open source libraries. Algorithms in isolation deliver little value, so I will walk through the gory details of turning an academic model into a robust and reliable data product. Lastly, I’ll explain how to layer machine learning onto the integer programming problem in order to realize true personalization that improves over time.
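To give a flavor of the mixed-integer programming basics mentioned above, here is a hedged toy problem (not Dia&Co's actual model): pick exactly two of four items for a box, maximizing ML-predicted preference scores subject to an inventory budget. It uses SciPy's open source `milp` solver; the scores and costs are made up.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

scores = np.array([0.9, 0.4, 0.7, 0.2])  # e.g., predicted by an upstream ML model
costs = np.array([3.0, 1.0, 2.0, 1.0])   # hypothetical inventory costs
budget = 5.0

res = milp(
    c=-scores,  # milp minimizes, so negate to maximize preference
    constraints=[
        LinearConstraint(costs, ub=budget),            # stay within budget
        LinearConstraint(np.ones(4), lb=2.0, ub=2.0),  # pick exactly two items
    ],
    integrality=np.ones(4),  # all decision variables are integer...
    bounds=Bounds(0, 1),     # ...and restricted to {0, 1}
)
chosen = np.flatnonzero(res.x > 0.5)
print(chosen)  # indices of the selected items
```

Layering learning on top, as the talk describes, amounts to updating the objective coefficients (the scores) from feedback while the constraint structure encodes the physical business rules.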
After the deep dive, we will zoom out to consider the entire human-in-the-loop process. I’ll touch on the various algorithms used at each point in the loop and close with lessons learned and future plans.

Edge intelligence: Machine learning at the enterprise edge

Chris Sachs - SWIM.AI

Enterprises and public sector organizations are drowning in real-time data from equipment, assets, suppliers, employees, customers, and city and utility data sources. Hidden insights have the potential to optimize production, transform efficiency, and streamline flows of goods and services, but finding insights cost effectively remains a challenge. Complex, big-data-focused, cloud-hosted ML solutions are expensive, slow, and unsuited to real-time data. It’s important to cost-effectively learn on data at the “edge” as it is produced.
Chris Sachs details an architecture for learning on time series data using edge devices, based on the distributed actor model. This approach flies in the face of the traditional wisdom of cloud-based, big-data solutions to ML problems. You’ll see that there are more than enough resources at “the edge” to cost-effectively analyze, learn from, and predict from streaming data on the fly.
The solution relies on two fundamental innovations:
A distributed actor fabric: Used in application frameworks from Erlang to Orleans, this approach models each entity in the real world as an actor or digital twin. Chris explains how the approach uses a distributed edge compute fabric that is stateful, efficient, secure, and resilient and runs on commodity edge devices, and how its creators enhanced the actor model to allow digital twins to learn—on their own real-world data—to predict future performance.
A self-training, unsupervised ML approach: Crucially, ML must be cost effective and use standard edge hardware, even nontraditional systems, such as ARM CPUs. The edge fabric transforms large volumes of low-value data into low volumes of high-value insights, and each actor predicts its own future behavior affordably, saving bandwidth and avoiding unnecessary storage and cloud processing.
Edge learning delivers new insights fast, specific to the local context, enabling the infrastructure to adapt to changing conditions. Learning at the edge on “high def” data—with many parameters per entity—enables us to avoid overfitting and to gain greater fidelity. An edge learning model is also maximally efficient in terms of communication, turning the edge environment into a parallel machine learning network distributed across edge nodes.
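To make the digital-twin idea concrete, here is a minimal, hypothetical sketch (not SWIM.AI's implementation): one actor per real-world entity ingests its own stream and maintains a cheap local model, here just an exponential moving average, so prediction happens on the device instead of shipping raw data to the cloud.

```python
class DigitalTwin:
    """One actor per real-world entity, holding only its own state."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha    # smoothing factor for the local model
        self.estimate = None  # current model state, learned from this entity's stream

    def observe(self, value):
        """Ingest one streaming observation and update the local model in place."""
        if self.estimate is None:
            self.estimate = value
        else:
            self.estimate = self.alpha * value + (1 - self.alpha) * self.estimate

    def predict(self):
        """Predict the next observation using local state only."""
        return self.estimate

twin = DigitalTwin()
for reading in [10.0, 12.0, 11.0, 13.0]:  # hypothetical sensor readings
    twin.observe(reading)
print(twin.predict())
```

Each twin's state is a few bytes, so millions of entities can learn in parallel on commodity edge hardware, with only high-value summaries forwarded upstream.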

AI-based Autonomous Response: Are Humans Ready?

Chris Thomas - Darktrace

Global ransomware attacks like WannaCry already move too quickly for humans to keep up, and even more advanced attacks are on the horizon. Cyber security is quickly becoming an arms race — machines fighting machines on the battleground of corporate networks. Algorithms against algorithms.
Artificial intelligence-based cyber defense can not only detect threats as they emerge but also autonomously respond to attacks in real time. As the shortage of trained cyber analysts worsens, the future of security seems to be automatic. But are humans ready to accept the actions machines would take to neutralize threats?
Darktrace recently ran tests across enterprises of all sizes in a variety of industries and has subsequently deployed AI-based autonomous response in over one hundred organizations. In this presentation, explore lessons learned and hear about several use cases in which autonomous response technology augmented human security teams.

In this session, learn about:
- AI approaches and algorithms for detecting and responding to threats
- How human teams adopt (or resist) automated defenses
- The concepts of ‘human confirmation’ mode and ‘active defense’
- Success stories across Smart Cities, genomics organizations, and industrial control systems

A novel adoption of LSTM in customer touchpoint prediction problems

KC Tung - Microsoft

LSTM networks are widely used in solving sequence prediction problems, most notably in natural language processing (NLP) and neural machine translation (NMT). In particular, the sequence-to-sequence (seq2seq) model is the workhorse for translation, speech recognition, and text summarization challenges. If a collection of individual sequences of events is organized as a corpus, an LSTM model may be constructed to predict the target outcome (i.e., conversion) or target sequence of events (the predicted touchpoint sequence that leads to conversion).
The adoption of LSTM in touchpoint prediction stems from the need to model the customer journey or the conversion funnel as a series of touchpoints. For an advertiser or marketer, taking into account the sequence of events that leads to a conversion adds tremendous value to the understanding of the conversion funnel and the impact of different types of touchpoints, and can even identify high-potential leads. With LSTM, touchpoint prediction can be framed as four different types of prediction problems: sequence prediction (the model predicts a future sequence, e.g., TV-TV-buy), sequence classification (the model predicts the target outcome, e.g., buy), sequence generation (the model predicts a target sequence with characteristics similar to the input sequence, e.g., repeat buy), and sequence-to-sequence prediction (the model predicts a target sequence that contains "buy").
KC Tung explains why LSTM provides great flexibility to model the consumer touchpoint sequence problem in a way that allows just-in-time insights about an advertising campaign’s effectiveness across all touchpoints (channels). LSTM models can be implemented at scale to identify potential marketing leads based on known touchpoint sequences during the campaign, empowering advertisers to evaluate, adjust, or reallocate resources or investments in order to maximize campaign effectiveness. Along the way, KC offers demos of LSTM models implemented in Keras and TensorFlow.
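As an illustrative framing (not KC's actual pipeline) of the sequence-classification variant described above: each customer journey is mapped to a padded integer sequence with conversion as the binary target, which is the standard input format an LSTM classifier in Keras or TensorFlow would consume. The vocabulary and journey length here are hypothetical.

```python
# Hypothetical touchpoint vocabulary and fixed journey length
VOCAB = {"<pad>": 0, "tv": 1, "email": 2, "search": 3, "social": 4}
MAX_LEN = 5

def frame_journey(touchpoints, converted):
    """Encode a touchpoint journey as (padded integer sequence, binary label)."""
    ids = [VOCAB[t] for t in touchpoints][:MAX_LEN]
    ids += [VOCAB["<pad>"]] * (MAX_LEN - len(ids))
    return ids, int(converted)

x, y = frame_journey(["tv", "tv", "search"], converted=True)
print(x, y)  # [1, 1, 3, 0, 0] 1
```

The other three framings in the abstract reuse the same encoding and change only the target: a future sequence, a generated sequence, or a full target sequence for seq2seq.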

Predicting Alzheimer’s: Generating neural networks to detect the neurodegenerative disease

Ayin Vala - DeepMD | Foundation for Precision Medicine

Machine learning and artificial intelligence have had a noticeable impact on the transportation, information technology, and finance sectors, but these successes have not been fully realized in medicine. However, that is changing. Computing has become very powerful, and analytics algorithms have become smart enough to spot patterns in patient characteristics and treatments to drive discovery.
Ayin Vala offers an overview of a deep learning project in personalized medicine aimed at early detection of Alzheimer’s disease for patients. Ayin explores contributing factors in historical diagnosis and medication, discusses how the project uses medical image recognition, and shares a demo of the decision support tool built for clinical facilities. This nonprofit effort is led by the Foundation for Precision Medicine in partnership with Google, the Yale School of Medicine, and the Mayo Clinic. The deep learning algorithms are based on 100K Alzheimer’s patient records selected from 4M patients.

Reading China: Predicting Policy Change with Machine Learning

Weifeng Zhong - American Enterprise Institute

For the first time in the literature, we develop a quantitative indicator of the Chinese government’s policy priorities over a long period of time, which we call the Policy Change Index (PCI) of China. The PCI is a leading indicator of policy changes that runs from 1951 to the third quarter of 2018, and it can be updated in the future. It is designed with two building blocks: the full text of the People’s Daily — the official newspaper of the Communist Party of China — as input data and a set of machine learning techniques to detect changes in how this newspaper prioritizes policy issues. Due to the unique role of the People’s Daily in China’s propaganda system, detecting changes in this newspaper allows us to predict changes in China’s policies. The construction of the PCI does not require the researcher’s understanding of the Chinese context, which suggests a wide range of applications in other settings, such as predicting changes in other (ex-)Communist regimes’ policies, measuring decentralization in central-local government relations, quantifying media bias in democratic countries, and predicting changes in lawmakers’ voting behavior and in judges’ ideological leanings.
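A toy illustration of the detection idea described above (not the PCI's actual algorithm, which uses machine learning on the full text): represent each period's front-page text as a word-frequency vector and flag a period whose similarity to the preceding one drops sharply, suggesting a shift in prioritized topics. The "articles" below are fabricated stand-ins.

```python
import math
from collections import Counter

def freq_vector(text):
    """Bag-of-words frequency vector for one period's text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

periods = [
    "industry steel production quota factory output",
    "industry steel production plan factory quota",
    "market reform opening enterprise foreign investment",  # priorities shift here
]
vecs = [freq_vector(p) for p in periods]
sims = [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
print(sims)  # similarity collapses at the shift
```

The PCI's learned model plays the role of this similarity score at scale, detecting changes in emphasis without requiring the researcher to read Chinese.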