Pints & Posters Reception

5:15 - 6:15 | Thursday, May 23

Conference attendees will enjoy brews and snacks while learning about interesting data science projects and accomplishments. Poster presentations will run in parallel; each speaker will provide a short (5-10 minute) presentation or interactive discussion to correspond with their visual poster.

To showcase your work via a poster session, apply here.

-

Tim Stacey will be presenting a poster titled, “ML as Teammate: Using AI in Cybersecurity.”

Cybersecurity analysis and forensic investigation have traditionally required a large team of analysts to filter through the millions of logs generated on a network. Many smaller companies and financial institutions lack the resources for such a team, leaving critically understaffed a function that can be crucial to protecting their assets.

Using AI and serverless architecture, my team has deployed a solution to perform much of this tedious work quickly, cheaply, and in the cloud. By providing an AI-driven, off-premises solution, we have advanced the ability of small financial institutions to monitor their networks and the transactions that drive their business.

In this presentation, I will discuss techniques used and challenges encountered while standing up an ML-based network monitoring solution for a small to mid-size financial institution.

-

Bisakha Peskin will present a poster titled, “Next to Purchase Model for Life Insurance.”

MassMutual (MM) and associated agencies offer a variety of life insurance and financial products. Customer needs are known to vary based on temporal features such as age and life events. As a customer’s profile changes, other products beyond their first purchase may begin to suit their needs. This gives sales and marketing an incentive to track each customer’s changing profile and encourage new purchases.

Our task was to predict the new product purchase likelihood of an existing customer. We generated 3.1 million customer years from the historical MM sales data from 2010 – 2015, with 0.0689 new purchases per customer. We were also engaged by a third party agency to analyze its historical book of business of 320K customer years from 2008 – 2016 with 0.085 new purchases per customer.

This model was trained to learn product purchasing patterns and characteristics from temporal historical data. The features consisted of owner-level, product-level, and engineered features. Owner-level features, such as age and median income, were added when available. Product-level features were visible starting the year after the purchase. We used gradient boosting, random forest, and logistic regression classifiers to predict purchase likelihood. For MM, the models achieved an AUC of at least 0.76. The model had a lift of four times over a random selection of buyers. For the third party agency, the average AUC was 0.82 over all products. The model had a lift of two times over an advisor’s intuition-based approach. The most predictive features were typically the past monetary spending on products and previous product ownership.
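For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how AUC and lift can be computed from model scores. The labels and scores are made-up toy values, not MassMutual data, and the helper names `auc` and `lift_at_k` are hypothetical; the poster does not say how the authors computed these figures.

```python
# Illustrative only: toy labels and scores, not data from the poster.

def auc(labels, scores):
    """Probability that a random buyer is scored above a random non-buyer.

    Ties between a buyer and a non-buyer count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift_at_k(labels, scores, k):
    """Purchase rate among the k highest-scored customers over the overall rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# 1 = the customer made a new purchase, 0 = no new purchase (assumed encoding).
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.1, 0.7, 0.3, 0.2, 0.4, 0.05, 0.6, 0.5]

print(auc(labels, scores))
print(lift_at_k(labels, scores, k=2))
```

A lift of four over random selection, as reported for MM, would mean that the customers the model ranks highest purchase at four times the overall rate.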

Our model can predict a customer’s propensity to purchase again, identify which variables are most indicative of this propensity and help plan outreach.

-

Michael Skarlinski will present a poster titled, “Primrose: a Data Science Framework for Rapid Production Models.”

New data science teams face major challenges in operationalizing models, building shared infrastructure, and on-boarding new members. In an effort to solve these common issues, WW (formerly Weight Watchers) created a new modeling and deployment framework, Primrose (Production In-Memory Solution). This Python framework utilizes an in-memory, DAG-running structure, which allows for a “configuration as code” approach to deploying models and recommenders. To facilitate deployment, the framework was designed to work efficiently with single-node datasets up to hundreds of GBs, while minimizing boilerplate production configuration settings and data IO library code. To assist in new member on-boarding and usability, the framework leverages the same code base and methods across applications ranging from collaborative-filter based recommenders to gradient boosted tree models. Primrose has facilitated the rapid growth of the WW DS team, from 1 to 7 members within a year, and has increased project velocity, allowing us to deploy 6 different, scalable production models and recommenders across disparate areas of the business.
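As a rough illustration of what an in-memory, DAG-running structure driven by configuration can look like, here is a minimal sketch. It is not Primrose's actual API: the names `CONFIG`, `REGISTRY`, `run_dag`, and the three node functions are all hypothetical, and the only library used is Python's standard `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical node implementations. Each receives a dict holding the
# outputs of its upstream nodes, keyed by node name.
def read_data(inputs):
    return [3.0, 1.0, 2.0]


def normalize(inputs):
    values = inputs["reader"]
    top = max(values)
    return [v / top for v in values]


def rank_items(inputs):
    return sorted(inputs["scaler"], reverse=True)


REGISTRY = {"read_data": read_data, "normalize": normalize, "rank_items": rank_items}

# The pipeline is plain data: node name -> operation and upstream dependencies.
CONFIG = {
    "reader": {"op": "read_data", "after": []},
    "scaler": {"op": "normalize", "after": ["reader"]},
    "ranker": {"op": "rank_items", "after": ["scaler"]},
}


def run_dag(config, registry):
    """Run every node in dependency order, keeping all results in memory."""
    graph = {name: set(spec["after"]) for name, spec in config.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        spec = config[name]
        upstream = {dep: results[dep] for dep in spec["after"]}
        results[name] = registry[spec["op"]](upstream)
    return results


results = run_dag(CONFIG, REGISTRY)
print(results["ranker"])
```

In a design like this, swapping a model or adding a step means editing the configuration rather than the runner, which is one plausible reading of the “configuration as code” approach described above.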

-

Hwa Jong Kim will present a poster titled, “Developing Data Science Processes with Non-technical Practitioners in the Real World.”

Big data and AI are among the hottest topics in Korean industry. However, most practitioners struggle to apply these technologies in the real world, because learning data science without a strong technical background in computer science and mathematics is difficult. Based on my experience consulting for data practitioners in diverse fields, I will share useful insights on how to develop effective data science processes.

-

Ande Stelk will present a poster titled, “Hitchhiker’s Guide to SAS in Domino.”

Ande will offer insights for existing SAS customers moving to SAS in containers and explain how Domino can benefit them, based on an extended use case at a large retailer. The poster highlights how SAS v9.4 and Viya in containers differ from more traditional deployments. It is aimed at end users and administrators.



Michael Skarlinski
Manager, Data Science, WW (formerly Weight Watchers)

Michael manages the data science team at WW, and helps develop a range of data products to serve the membership. Some of his team’s current initiatives include: social media feed recommenders, membership models, recipe recommenders and member identity resolution models.

Tim Stacey
Director, Data Science, Adlumin

Tim Stacey is the Director of Data Science for Adlumin Inc, a cybersecurity software firm based in Arlington, VA. At Adlumin, his primary focus centers on user behavior analytics. His experience includes designing analytics for the IoT domain and natural language processing.

Bisakha Peskin
Lead Data Scientist, Assistant Vice President, Massachusetts Mutual Life Insurance Company

Bisakha Peskin is a Lead Data Scientist at the MassMutual Life Insurance Company. She has a PhD in Systems and Computational Biomedicine from the New York University School of Medicine and a Master’s in Computer Science from the Johns Hopkins University. In her doctoral work, she focused on building methods for predictive modeling from multimodal biomedical data. In the past, she worked as a software developer at the Johns Hopkins Hospital and also spent a summer with the Digital Intelligence team at J P Morgan Chase.

Hwa Jong Kim
Professor; CEO, Kangwon National University; Lab Venture

Hwa-Jong Kim is a professor in the Computer Engineering Department and Head of the Data Analytics Center at Kangwon National University in South Korea. He wrote the prize-winning textbook, “Introduction to Data Science”. He consults on data scientist development programs in many industries, including at KEPCO, LG and LS.

Ande Stelk
Technical Account Manager, SAS

I began my career in retail, working in a variety of store, district, and corporate positions in analyst and operational roles. I ended my 20 years in retail working in IT, managing a data science team and an engineering team that owned all BI and analytics applications. For the past 7 years I have worked with a wide variety of SAS Institute customers to transition to public and private clouds and containers, and to identify where SAS and open source can complement each other.