Keynote Speakers


Dr. Feryal Ozel

Professor of Astronomy & Astrophysics
University of Arizona, Department of Astronomy

Feryal Ozel is a Professor of Astronomy and Physics at the University of Arizona. Dr. Ozel's primary research interests are the physics of black holes, neutron stars, and theoretical high-energy astrophysics. She is Chair of the NASA Astrophysics Advisory Committee, Chair of the NASA Lynx Large Mission Concept Study, and Lead of the Event Horizon Telescope Modeling and Analysis Group. She received her PhD in astrophysics from Harvard University and was a Hubble Fellow at the Institute for Advanced Study in Princeton. She was awarded the American Physical Society Maria Goeppert Mayer Prize, was elected to the Science Academy in Turkey in 2013, and was elected a Fellow of the American Physical Society in 2015. She has been awarded a Radcliffe Fellowship, a Guggenheim Fellowship, and a Visiting Miller Professorship at UC Berkeley. In 2019, she shared the Breakthrough Prize and the National Science Foundation Diamond Achievement Award with the Event Horizon Telescope collaboration for taking the first image of a supermassive black hole. Dr. Ozel serves on numerous national committees and advisory boards in astrophysics and appears in science programs and documentaries worldwide.

Session Abstract

Astrophysical experiments of the current era generate the largest volume and rate of data encountered anywhere in the world. For example, the Event Horizon Telescope, which took the first image of a black hole in the nearby galaxy M87, recorded 5 petabytes of data over a 5-night observing campaign, while the Large Synoptic Survey Telescope is poised to image the entire sky at a high cadence and gather 20 terabytes per night that need to be processed in real time. Different datasets pose different types of challenges, ranging from identifying weak signals to making robust statistical models. I will describe various problem-specific hardware and algorithmic solutions developed in the astrophysics community for these use cases and show that one approach does not fit all.


Jenny Bryan

Software Engineer & Data Scientist

Jenny Bryan is a Software Engineer and Data Scientist at RStudio and an Adjunct Professor of Statistics at the University of British Columbia with a Ph.D. in Biostatistics from the University of California, Berkeley. She works on a team that develops open source R packages to make data science faster, easier and more fun.

Session Abstract

Coming soon.

Carol Willing

Willing Consulting

Carol Naslund Willing serves on Project Jupyter’s Steering Council and works as a Core Developer on JupyterHub and mybinder.org. She serves as a co-editor of The Journal of Open Source Education (JOSE) and co-authored an open source book, Teaching and Learning with Jupyter.

She is a member of Python’s inaugural Steering Council and a core developer of CPython. She’s a Python Software Foundation Fellow and former Director. In 2019, she was awarded the Frank Willison Award for technical and community contributions to Python. With a strong commitment to community outreach, Carol co-organizes PyLadies San Diego and San Diego Python User Group.

Carol has an MS in Management from MIT and a BSE in Electrical Engineering from Duke University.

Session Abstract

Coming soon.

Fireside Chat

Anna Counselman

Co-founder - Upstart
Building a data science driven company.

Anna Counselman is co-founder and head of people and operations at Upstart, a leading AI lending startup with more than $160M in venture capital funding. Anna co-founded Upstart in 2012, after a career at Google leading operations. She helped scale Upstart from zero to 200+ employees and originate more than $3.5B in loans. At Google, Anna led Gmail Consumer Operations for 5 years as it scaled from 150 million to 450 million users and launched the global Enterprise Customer Programs team. She also held a variety of operations roles at McMaster-Carr and several other startups. Anna graduated Summa Cum Laude from Boston University with a BA in Finance and Entrepreneurship. Anna received a White House Champion of Change award and was recognized as one of Silicon Valley Business Journal's 40 under 40.


Falon Donohue

CEO - VentureOhio
Building a data science driven company (Moderator)

Coming Soon.

Session Abstract

Coming soon.

Platforms & Process: Tools, Data, Infrastructure

Jordan Hagan

Senior Data Scientist - Miner & Kasch
Optimizing Data Pipelines: Why SQL is Underrated

Jordan Hagan is a Senior Data Scientist with Miner & Kasch. She started her career in 2010 as a government subcontractor working with the OIG/DOJ providing analytic insights to support Medicare Part D fraud, waste, and abuse cases. With expansive healthcare domain knowledge, Jordan has worked closely with hospitals across the United States to build data-driven systems to improve patient outcomes, parse surgeons’ notes, and streamline provider workflows to ensure the highest quality of care. Recently expanding beyond healthcare, she has built workflows to help companies identify customers for personalized retention and marketing efforts powered by recommendation engines and prediction algorithms. Residing in Denver, Colorado, Jordan spends her free time exploring the mountains and sampling great beer.

Session Abstract

This talk is aimed at the many Data Analysts, Data Scientists, and Software Engineers who prefer to write as little SQL as possible and perform analysis and data manipulation in memory. SQL and databases can do a lot of heavy lifting, speeding up data pipelines. Pulling large volumes of data into memory can hinder workflows if the SQL and programming language of choice are not optimized properly. Knowing SQL best practices and which languages to use when will ease your data manipulation and extraction processes.
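As a rough illustration of the trade-off described above, the sketch below compares aggregating in application memory with letting the database do the aggregation. It is not taken from the talk: the `visits` table, its columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` module stands in for whatever database a real pipeline would use.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (provider TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("A", 100.0), ("A", 50.0), ("B", 75.0), ("B", 25.0), ("B", 20.0)],
)


def totals_in_memory(connection):
    """Pull every row into Python, then aggregate in application code."""
    totals = {}
    for provider, cost in connection.execute("SELECT provider, cost FROM visits"):
        totals[provider] = totals.get(provider, 0.0) + cost
    return totals


def totals_in_sql(connection):
    """Let the database aggregate; only one row per provider is transferred."""
    query = "SELECT provider, SUM(cost) FROM visits GROUP BY provider"
    return dict(connection.execute(query))


in_memory = totals_in_memory(conn)
in_sql = totals_in_sql(conn)
```

Both functions return the same totals. The difference is how many rows cross the boundary between the database and the program, which is what matters once the table is large.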


Sara Daqiq

OAuth and OIDC - Okta
Securing your ML Assets: OpenID Connect

Sara lives to build great products for businesses. A no-nonsense developer, she currently works at Okta, an identity management company, as a developer support engineer. In this role, she speaks with developers and helps them with the implementation and workflow of their SDKs. Sara enjoys helping others learn to code; she taught for two summers with Girls Who Code and was a code coach at theCoderSchool. She is also passionate about women’s rights, helping women become financially independent, and working with women around the world to help them develop useful skills. In pursuit of this goal, she founded AccessLocal, an organization that teaches literacy and financial planning to underprivileged females in rural Afghanistan. Sara graduated from Georgetown College with a Bachelor’s in Information Systems. In her free time, she enjoys bike riding, yoga, and going to the gym.

Session Abstract

How can application developers provide their users with secure authentication without investing a lot of time, and instead focus on building the parts of their app that will drive their business? With OpenID Connect (OIDC), you grant authority to a trusted provider to prove that the user is who they say they are. OIDC is built on top of OAuth 2.0, so it has all the functionality of OAuth and more. In this talk, we’ll explore how applications communicate to grant access to resources on behalf of a user via OIDC.
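The first step of such an exchange can be sketched as follows, assuming the authorization code flow: the application redirects the user to the provider's authorization endpoint with the `openid` scope, and the provider later redirects back with a code. The endpoint URL, client ID, and redirect URI below are hypothetical placeholders, not values from the talk or from any particular provider.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_auth_request(authorize_endpoint, client_id, redirect_uri):
    """Build the URL that sends the user to the identity provider.

    Returns the URL plus the state and nonce values, which the application
    must remember so it can check them when the provider redirects back.
    """
    state = secrets.token_urlsafe(16)  # guards the redirect against CSRF
    nonce = secrets.token_urlsafe(16)  # ties the returned ID token to this request
    params = {
        "response_type": "code",    # authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile",  # the "openid" scope makes this an OIDC request
        "state": state,
        "nonce": nonce,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state, nonce


# Hypothetical provider and client values, for illustration only.
url, state, nonce = build_auth_request(
    "https://idp.example.com/authorize",
    "my-client-id",
    "https://app.example.com/callback",
)
```

After the user signs in, the provider redirects to the callback with a short-lived code, which the application exchanges at the provider's token endpoint for an ID token and an access token.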


Paige Roberts

Open Source Relations Manager - Vertica
Architecting Production IoT Analytics

In two decades in the data management industry, Paige Roberts has worked as an engineer, a trainer, a support technician, a technical writer, a marketer, a product manager, and a consultant.

She has built data engineering pipelines and architectures, documented and tested large scale open source analytics implementations, spun up Hadoop clusters from bare metal, picked the brains of some of the stars in the data analytics and engineering industry, championed data quality when that was supposedly passé, worked with a lot of companies in a lot of different industries, and questioned a lot of people's assumptions.

Now, she promotes understanding of Vertica, MPP data processing, open source, and how the analytics revolution is changing the world.

Session Abstract

Analyzing Internet of Things data has broad applications in a variety of industries, from smart cities to smart farms, from network optimization for telecoms to preventative maintenance on expensive medical machines or factory robots. When you look at technology and data engineering choices, even in companies with wildly different use cases and requirements, you see something surprising: Successful production IoT architectures show a remarkable number of similarities.

Join us as we drill into the data architectures in a selection of companies like Philips, Anritsu, and Optimal+. Each company, regardless of industry or use case, has one thing in common: highly successful IoT analytics programs in large scale enterprise production deployments.

By comparing the architectures of these companies, you’ll see the commonalities, and gain a deep understanding of why certain architectural choices make sense in a variety of IoT applications.

By studying successful production architectures, you’ll learn to:

- Judge large scale IoT technology choices critically and objectively

- Avoid some of the traps that have cost other companies time and money and caused so many implementations to fail

- Insulate your company from some of the impact of rapid change in data management technology

- Learn from other companies’ mistakes and successes so you don’t have to reinvent the wheel

- Choose an architecture that will help ensure your AI and ML projects make it into production where they have real impact

Advanced & Applied Methods: Machine Learning, Deep Learning, AI, Statistical

Julia Silge

Data Scientist - Stack Overflow
Text Mining Using Tidy Data Principles

Julia Silge is a data scientist at Stack Overflow and the author of Text Mining with R. She is both an international keynote speaker and a real-world practitioner focusing on data analysis and machine learning practice. She loves making beautiful charts and communicating about technical topics with diverse audiences.

Session Abstract

Text data is increasingly important in many domains, and tidy data principles and tidy tools can make text mining easier and more effective. In this talk, learn how to manipulate, summarize, and visualize the characteristics of text using these methods and R packages from the tidy tool ecosystem. These tools are highly effective for many analytical questions and allow analysts to integrate natural language processing into effective workflows already in wide use. Explore how to implement approaches such as measuring tf-idf, topic modeling, and building classification models.
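For readers unfamiliar with tf-idf, here is a minimal sketch of one common definition: term frequency within a document multiplied by the natural log of the inverse document frequency. The talk itself uses R packages. This Python version, its two-document corpus, and the function name `tf_idf` are illustrative assumptions only.

```python
import math
from collections import Counter

# Hypothetical toy corpus: document name -> text.
docs = {
    "a": "the cat sat on the mat",
    "b": "the dog sat",
}


def tf_idf(documents):
    """Return {(document, term): tf-idf score} for a dict of raw texts."""
    counts = {name: Counter(text.split()) for name, text in documents.items()}
    n_docs = len(documents)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for c in counts.values() for term in c)
    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        for term, n in c.items():
            tf = n / total
            idf = math.log(n_docs / doc_freq[term])
            scores[(name, term)] = tf * idf
    return scores


scores = tf_idf(docs)
```

Words that appear in every document, such as "the" here, score zero, while words distinctive to one document score higher.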


Vishakha Lall

Software Engineer - Fidelity Investments
Don’t let Neural Networks intimidate you – Understanding complex networks with simplicity

Vishakha is a recent Engineering graduate with a knack for technology! She loves building software that makes an impact and has a keen interest in the geeky world that revolves around data, analytics, cloud and blockchain! She enjoys sparking creativity with the software she builds. Looking at community problems from the perspective of an engineer lets her work on ingenious ideas and solutions. One such idea won her the title of 'Inspiring Innovator' at Anita's Moonshot Codeathon for working on a novel solution to combat traffic problems by encouraging intelligent lane driving. The same algorithm was also published as an IEEE research paper at ICCCNT, 2018.

Vishakha is a strong advocate of community, sharing knowledge and ideas with fellow tech-enthusiasts and collaborating to build better solutions for the planet. She often applies as a mentor in multiple initiatives to help beginners with their first steps in technology with the aim to galvanize them to think creatively. At work, she collaborates on and builds optimized solutions for complicated graph network problems.

Session Abstract

The session is a must-visit for any Data Science enthusiast with an interest in Computer Vision, Object Detection, Identification and Segmentation, Neural Networks and the like. No particular level of proficiency in any of the above is required, although a beginner-level understanding of the concepts would be helpful. The talk will use a specific example of Object Segmentation with a complicated Neural Network model, with the aim that attendees can extrapolate the knowledge to several independent applications. Neural Network models often become hard to understand as more layers are added, and it gets challenging for a beginner to get the hang of them. The session will, therefore, focus on how and when to add layers, on parallelism, and on understanding the in-depth working of the model. During the session, we will work on a live demonstration in a Python notebook and go through every step together. I consider myself a rookie too, which I believe will encourage all beginners in the crowd to believe in themselves and not just work on complicated solutions but understand them inside and out too!

Driving Value & Uncovering Insight: BI, Data Visualization, Storytelling

Helen Pollitt

Head of Digital - Avenue Digital
How to win friends and influence people with reports

Helen is Head of Digital at Avenue Digital, with over ten years of experience in digital marketing and analytics. She has worked with a variety of international corporates, small local businesses and start-ups to develop a holistic digital marketing strategy that relies on measurement and data. Helen has spoken across the globe and written for leading online publications about digital marketing and analytics.

Session Description

Reports are created to serve the purpose of communicating insight, but all too often they are an overlooked source of marketing potential.

This talk will highlight how to use your existing reports to gain budget and buy-in from stakeholders, to promote your work and make sure the value of your work is known.

Key takeaways will include:

- Learning how to articulate through your reports so anyone will understand your data

- Using your reports for your own marketing purposes

- Discovering how to tell a story with data

Learn from a marketer about how to get your reports to work for you beyond communicating what the data says.

Trust & Governance: Ethics, Governance, Policy, Risk


Valeria Cortez Vaca Diez

Data Scientist - Lloyds Banking Group
Detecting Discriminatory Outcomes in Machine Learning Models: A Case Study of Credit Model

Valeria is a Data Scientist at Lloyds Banking Group, specializing in the design and build of scalable Machine Learning solutions for different business areas of LBG and their customers. Her current work at LBG focuses on building tools and processes to detect and mitigate bias in ML models.

Before working at LBG, Valeria studied Business Informatics and Technology Management in Germany. For her final dissertation, she led a study on the economics of privacy at Microsoft Research in Cambridge to understand the trade-offs between reward and disclosure of personal information across cultures.

After finalizing her studies, Valeria worked with the startup TAB as a product manager to build the most comprehensive analytics platform on P2P lending and crowdfunding. During this time, she discovered her passion for Data Science, which led her to take a year to complete a Master’s degree in Business Analytics at Imperial College London. As part of her final project, she researched discriminatory outcomes in machine learning to analyze unfair treatment in credit models.

Valeria is a strong advocate of ethics and responsibility in AI as well as bringing more diversity into tech teams.

Session Description

Machine Learning and AI are considered by many as techniques free of personal judgment and biases. However, there is significant evidence that proves the opposite, with these methods leading to harmful discrimination and potentially causing long-lasting negative impacts on society.

Policing, hiring and lending are some of the many areas where Machine Learning has disproportionately harmed the most vulnerable groups in our society. Understanding unfair treatment in AI and ML is now crucial to prevent automated discrimination at scale.

The fundamental techniques to analyze and detect bias in Machine Learning decisions can be explained through simple metrics applied to model outcomes. The aim of this presentation is to pass on this knowledge so that any person, in a technical or non-technical role, can be empowered to challenge how Machine Learning is implemented.

In this talk, I will first focus on how machine learning models make decisions that affect single individuals. Then, I will explain, through some examples, the metrics we can use to understand whether the model’s outcomes are having a negative impact on a group in society. Finally, I will present a case study that applies the topics covered.
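One example of a simple metric applied to model outcomes is the approval (selection) rate per group and the ratio between the lowest and highest rates. The description does not say which metrics the talk covers, so this choice, the function name `selection_rates`, and the toy data below are all assumptions made purely for illustration.

```python
def selection_rates(decisions, groups):
    """Share of positive decisions (1 = approved) within each group."""
    totals, approved = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + decision
    return {group: approved[group] / totals[group] for group in totals}


# Hypothetical model outcomes for two groups, "x" and "y".
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["x", "x", "x", "x", "y", "y", "y", "y"]

rates = selection_rates(decisions, groups)
# Ratio of the lowest to the highest approval rate; 1.0 would mean parity.
ratio = min(rates.values()) / max(rates.values())
```

A ratio well below 1.0 signals that one group is approved far less often than another. That is a prompt for further investigation, not proof of discrimination on its own.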

Leadership & Strategy: Leadership, Culture, Talent, Strategy


Wendy Anderson

Data & Analytics Group Manager - Intuit
Data as a product feature deliverable, not an afterthought

Wendy Anderson leads the Data and Analytics team for Intuit’s Mint and Turbo consumer finance applications, where data is not only the key to measuring business performance but, in the form of user insights, is also directly integrated into the experience to help users track and understand their financial health.

She has over 15 years of digital analytics experience and has implemented data and analytics platforms for Overstock.com and Sony Entertainment Network prior to joining Intuit.

Session Description

Surprisingly, many new product experiences and capabilities are created without a plan to measure their impact. As a result, the data required to measure success are overlooked, leading to missing data instrumentation or pipelines, misconfigured A/B tests that are inconclusive, and data science models without a feedback loop to improve. Decision making is slower as a consequence.

Data has to be treated as a deliverable, as important as any user-facing feature. In the near future, it will be the foundation of features. At Intuit, we realized that data has to be the center of our ecosystem and applications have to be considered consumers of the data. This required changes in our data infrastructure, processes, and organizational mindset.

Three key elements are necessary for this change: the decision plan, the data pod, and organizational discipline. Decision plans ensure that success criteria are defined upfront and provide a roadmap for data requirements. Data pods are virtual scrum teams that own the quality and availability of data for their feature and include: product manager, application developer, architect, data analyst, data engineer, and data scientist. Lastly, the organizational commitment to delivering data as part of the product feature is required to reinforce data quality and completeness.

Industry & Innovation: Use Cases, Emerging Trends

Abigail Baldridge, MS, Preventive Medicine

Abigail Baldridge

Assistant Director of Research - Northwestern University
A Practical Approach to Reproducibility in Academia and Beyond

Abigail (Abi) Baldridge is a research director and biostatistician in the Center for Global Cardiovascular Health and Department of Preventive Medicine, Feinberg School of Medicine at Northwestern University. She is an experienced public health leader with a history of working in academic research and in the pharmaceutical and medical device industries as an engineer and biostatistician. Abi is skilled in project management, data science, data visualization, biostatistics, clinical research, and teaching within higher education. She works daily in SAS, Stata, and R, and places particular personal emphasis on promoting and adhering to practices for reproducible research. Abi is currently pursuing a doctoral degree in public health at Johns Hopkins Bloomberg School of Public Health with a focus on implementation science.

Session Description

Reproducibility, wherein data analysis and documentation are sufficient so that results can be recomputed or verified, is an increasingly important component of statistical practice. Research is more efficient and robust when research teams can easily recreate and reproduce findings using original data. However, adopting reproducible research workflows can be daunting due to technical barriers, a perceived need to switch away from a favorite software, or the impression that reproducible research is an “all-or-nothing” endeavor.

In the first half of this session, we will explore how to approach reproducible research: steps for starting small, expanding capability, and both technical and non-technical strategies to help along the way. This session will include a broad overview of tools and software for source code control, electronic laboratory notebooks, containers, and manuscript preparation tools.

In the second half of this session, we will take a deeper dive into manuscript preparation and dynamic documents, with a focus on StatTag. StatTag is a free, open-source program that embeds statistical results from R/R Markdown, SAS, Stata, or Python directly in Microsoft Word. With StatTag, results inserted into a Word document can be updated automatically or on-demand, and retain their linkage to the code even when the document changes hands, is redlined, or the text is copied and pasted elsewhere.

This session is well suited for analytics professionals with any level of expertise.


Leigh Stauffer

Software Engineer - Mobikit
Challenges in Mobility and Telematics Data

Coming Soon.

Session Description

Coming soon. 



Lea Pica

LeaPica.com | Analytics Presentation & Data Visualization Consulting

Coming Soon.

Workshop Description

6-Hour Data Storytelling Masterclass


Ruth Milligan

Founder / Managing Director

As an executive speech coach and trainer, Ruth Milligan lives at the intersection of deep knowledge fields, from science and research to medicine and data/analytics, and speakers’ desires to be highly resonant and engaging. This is the culmination of her nearly 30-year career practicing some form of communications. Ruth founded Articulation in early 2010 after hosting one of the first TEDx events, TEDxColumbus, which is also now one of the longest running TEDx programs in the world. Since then she and her team have coached over 500 people in TEDx or TED-style talks alongside training thousands in their original classes on content framing, storytelling, public speaking, executive presence and accessing science. Her professional passion is to help organizations of all sizes create storytelling cultures that elevate the opportunities for associates and executives to practice and deliver great presentations. She believes that a great talk can come from a nervous, reluctant or beginning speaker, given the right feedback and development environment. After ten years of curating, organizing and hosting TEDx events, she is also a seasoned host, emcee and consultant to a wide range of events from major donor events at universities to pitch events inside data and analytics teams. Ruth has been the official speaker coach for the Women in Analytics presenters since 2017 and helped to host the main stage presenters in 2018. She lives in Columbus with her husband and two teenage children.

Workshop Description

While data increases its influence in making business decisions and driving strategies, sharing insights requires an understanding of story to make the data come to life. Story means the way in which we reveal the problem, the solution and the ultimate impact that data will inform. This class will focus on how to take a basic data set and turn it into a compelling story that will resonate with your business partners. Bring a data set you are working on to apply during class.

Note: This class will not address data visualization.


Sandy Steiger

Director of the Center for Analytics and Data Science
Miami University

As Director of the Center for Analytics and Data Science, Sandy Steiger is responsible for providing co-curricular experiences that allow students to embrace the application of analytics and data science. Prior to joining Miami University, Steiger spent 15 years at 84.51° and dunnhumbyUSA. Her most recent role was Vice President, Data Science & Analytics, where she was responsible for developing a strategic roadmap that would bring transformational, innovative thinking to 84.51°, focused on driving data science at scale across the Enterprise. Steiger holds a Master of Science in Statistics from Miami University, Ohio, and a Bachelor of Arts in Mathematics and Business from Mount St. Joseph University.

Workshop Description

While data increases its influence in making business decisions and driving strategies, sharing insights requires an understanding of story to make the data come to life. Story means the way in which we reveal the problem, the solution and the ultimate impact that data will inform. This class will focus on how to take a basic data set and turn it into a compelling story that will resonate with your business partners. Bring a data set you are working on to apply during class.

Note: This class will not address data visualization.


Ezgi Karaesmen

PhD Student - The Ohio State University

Ezgi Karaesmen is a PhD candidate at the Ohio State University College of Pharmacy. She is a genomic data scientist with a cancer biology background. Currently, she works with large genomic and clinical datasets in the context of bone marrow transplants. Broadly, she is interested in associations of germline genetic variants with survival events of leukemia patients following their transplant.

Workshop Description

Coming Soon.

Katie Sasso-Schafer

Director of Data Science
Columbus Collaboratory

Coming soon.

Workshop Description

Coming Soon.

Speaker Bureau

Find speakers from previous WIA events.
View content, slides, videos.

Call for Speakers

Interested in sharing your expertise?
Apply by February 29, 2020.
