Prime Hints For Running A Data Project In R

I’ve been asked more and more for hints and best practices when working with R. It can be a daunting task, depending on how deep or specialised you want to be. So I tried to keep it as balanced as I could and mentioned points that definitely helped me in the last couple of years. Finally, there’s lots (and I... [Read More]

Trump Vs Clinton Interpretable Text Classifier

I’ve been writing and talking a lot about LIME recently: on this blog, at the H2O meetup, or at the upcoming AI Congress. I’m still sooo impressed by this tool for interpreting any algorithm, even a black-box one! The part I love most is that LIME can be applied to both image and text data, which was well showcased in husky VS wolf (image)... [Read More]

End Of Year Thoughts

Sometimes it’s worth making New Year resolutions… A year ago I made one for 2017: to start an R blog using RMarkdown and Jekyll static sites. At the time, I didn’t even know git that well, had no clue what static sites were and was mostly oblivious to the rich and vibrant R community on Twitter. Fast-forward one year and…... [Read More]

Star Wars Vs Star Trek Word Battle

It goes without saying that I’m super excited about the premiere of another Star Wars movie, and I’m no exception. This coincided with Piotr Migdal’s challenge posted on the Data Science PL group on Facebook, in which he suggested comparing word frequencies between two different sources. It didn’t take me long to decide which sources to choose! So in... [Read More]

Automated and Unmysterious Machine Learning in Cancer Detection

I get bored doing two things: i) spot-checking and optimising the parameters of my predictive models and ii) reading about how ‘black box’ machine learning (particularly deep learning) models are and how little we can do to better understand how they learn (or fail to learn, for example when they mistake a panda bear for a vulture!). In this post I’ll... [Read More]