VEM-8021 Do we know our data as good as we know our tools? | Devoxx

Devoxx UK 2019
from Wednesday 8 May to Friday 10 May 2019.

   Do we know our data as good as we know our tools?


Big Data & AI
Intermediate level

Many of us who are developers turning into data scientists are mainly concerned with how to build a model, train it, and so on. And yes, we want the best accuracy (close to 99%).

But as every seasoned data scientist will advise, we first and foremost need to understand our data and ensure it is clean and prepared before doing any training on it.

During the conference, we will explore several problems that occur during data analysis or preparation and, for each, a technique to solve it (picked from a list of candidates). You will go away with a number of resources to explore at your own pace.

We will cover these categories of problems:

  • dirty data
  • disparate datasets - needing normalisation
  • too much information to process
  • and others…

We will cover some of these techniques:

  • analysis - detecting misleading data, outliers, specific time series issues
  • cleaning - dealing with missing/ambiguous values and outliers, generating synthetic data, resampling
  • preparation - using statistical and physics functions, dimensionality reduction, feature selection, resampling

We will also use different kinds of plots, each relevant to a different stage.
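To make two of the cleaning techniques listed above concrete, here is a minimal sketch that is not taken from the talk itself. It assumes median imputation for missing values and the common 1.5 × IQR rule for flagging outliers. The function names and the sample `readings` are hypothetical, and only the Python standard library is used.

```python
import statistics

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with two gaps and one suspicious spike.
readings = [10.0, 12.0, None, 11.0, 13.0, 95.0, 12.0, None, 11.0]

cleaned = impute_missing(readings)   # gaps filled with the median
outliers = iqr_outliers(cleaned)     # values flagged for manual review
```

The median is used rather than the mean because a single extreme value (such as the spike in the sample) would otherwise distort the imputed value. Whether a flagged point is an error or a genuine observation still needs domain knowledge.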

tips   techniques   methodology   Tools  
Jeremie Charlet
From Trackener

Entrepreneur, polyglot developer, with a thrill for learning and passionate about ML/DL.

CTO at Trackener, a tech startup focused on horse care, in charge of the cloud platform. Previously a backend developer at TNA Gov UK and Worldline by Atos.

Mentor, mentee and co-lead of the Machine Learning Study group in the meet-a-mentor community.

Mani Sarkar
From Independent / Freelancer

Mani Sarkar is a passionate developer, mainly in the Java/JVM space, based in London, UK. He is a Java Champion, JCP Member and OpenJDK contributor, involved with the LJC and other developer communities, @adoptopenjdk and other F/OSS projects. He writes code not just in Java/JVM, hence he likes to call himself a polyglot developer. He sees himself working in the areas of core Java, JVM, JDK, Hotspot, Graal, GraalVM, Truffle, VMs, and Performance Tuning. An advocate of a number of agile and software craftsmanship practices, he is a regular at many talks, conferences and hands-on workshops, where he speaks, participates, organises and helps out. He often expresses his thoughts via blog posts, microblogs (tweets) and other forms of social media.
