Big data are precious, but only if we manage to extract information from them. To do this, we need to write algorithms that analyse them. Dealing with big data poses many challenges. Most of them are technological (storage, computational power), but there is also the question of how to write an analysis algorithm efficiently, taking into account both the technological limitations and the business priorities. The best algorithm on paper is often not the most efficient “in code”. Drawing on my personal journey from theoretical physics to data science, I will explain what it means to approach a problem theoretically and why big data are so tricky to approach, even at the theoretical level.