Big Data, Cloud & DevOps

Manage your Data Science project structure in early stage

Understand when you actually need a well-defined project folder structure. Do not focus on hyperparameter tuning in the early stage, as it can consume too much time and too many resources. Once a solution is confirmed, that is a good time to search for better hyperparameters before launching the prediction service. Modularizing your processing, training, and metrics evaluation functions is an important step toward making that tuning manageable. In the early stage, focus on building a model rather than on making sure everything works well in unexpected scenarios.
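As a minimal sketch of what such modularization might look like, the example below splits a toy experiment into separate preprocessing, training, and evaluation functions. The function names, the least-squares line fit, and the sample rows are illustrative assumptions, not taken from the article.

```python
from statistics import mean


def preprocess(rows):
    """Drop rows with missing values and convert the rest to floats."""
    return [(float(x), float(y)) for x, y in rows if x is not None and y is not None]


def train(data):
    """Fit y = slope * x + intercept by ordinary least squares.

    Assumes at least two distinct x values, otherwise the variance is zero.
    """
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    x_bar, y_bar = mean(xs), mean(ys)
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / variance
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def predict(model, x):
    """Apply the fitted line to a single input."""
    return model["slope"] * x + model["intercept"]


def evaluate(model, data):
    """Mean absolute error of the model on the given data."""
    return mean(abs(predict(model, x) - y) for x, y in data)


def run_experiment(rows):
    """Chain the stages; each one can be replaced or tuned on its own."""
    data = preprocess(rows)
    model = train(data)
    return model, evaluate(model, data)
```

Because each stage is its own function, a later tuning step could swap `train` or `evaluate` without touching the rest of the experiment.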

What is the most affordable way to get data science capabilities?

Artificial intelligence, machine learning, natural language processing, sentiment analysis and more are just a few of the techniques which are generically called Data Science. The real question is: what is the best choice for your company regarding these services? Should you train your existing staff, hire data scientists or outsource to a professional organization? There is no single correct answer to these questions, and each entity should start with an evaluation of their expectations and needs. In this article we’ll provide some guidelines to facilitate this decision. 

3 MINUTES READ

Five Reasons why you should use Cross-Validation in your Data Science Projects

Cross-validation is a very powerful tool. It helps us make better use of our data, and it gives us much more information about our algorithm's performance. In complex machine learning models, it is easy to not pay enough attention and to use the same data in different steps of the pipeline. In most cases this leads to performance that looks good but is not real; in others it introduces strange side effects. We have to take care that our confidence in our models is justified.
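To make the idea concrete, here is a minimal sketch of k-fold cross-validation in plain Python. The helper names and the toy "predict the training mean" model are assumptions chosen for illustration. The folds are contiguous, so the sketch also assumes the data has already been shuffled. The key point is that the model is fitted only on the training indices of each fold and scored only on the held-out indices.

```python
from statistics import mean


def k_fold_splits(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def fit_mean_model(targets):
    """Toy model (an assumption for this sketch): always predict the training mean."""
    return mean(targets)


def cross_validate(targets, k):
    """Return the mean absolute error on each held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(targets), k):
        # Fit on the training part only, so held-out data never leaks in.
        model = fit_mean_model([targets[i] for i in train_idx])
        scores.append(mean(abs(targets[i] - model) for i in test_idx))
    return scores
```

The spread of the per-fold scores gives a rough sense of how much the performance estimate depends on which samples happened to be held out.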

7 MINUTES READ

    Five Principles for Big Data Ethics

    Big data analytics raises a number of ethical issues, especially as companies begin monetizing their data externally for purposes different from those for which the data was initially collected. The scale and ease with which analytics can be conducted today completely changes the ethical framework. We can now do things that were impossible a few years ago, and existing ethical and legal frameworks cannot prescribe what we should do. While there is still no black or white, experts agree on a few principles.

    1 MINUTE READ

    Word embeddings: exploration, explanation, and exploitation (with code in Python)

    Word embeddings have been a topic of discussion among natural language processing scientists for many years. The idea behind all word embeddings is to capture as much semantic, morphological, contextual, hierarchical, and other information as possible, but in practice some methods are clearly better than others for a particular task. Choosing the best embeddings for a particular project is always a matter of trial and error, so understanding why one model works better than another in a particular case helps considerably in real work.

    11 MINUTES READ

    Fourteen Golden Nuggets to Demystify Data Science for Aspiring Data Scientists

    Demystifying Data Science, a free conference for aspiring data scientists and data-curious business leaders, was designed to provide insight into the training, tools, and career paths of data scientists. The conference was fully interactive, featuring real-time chat, worldwide Q&A, and polling. 14 speakers presented live before taking questions submitted via the real-time conference chat feature. The talks covered a wide range of topics: from showcasing your work to connecting with data leaders, from telling a persuasive data story to debunking myths in data science.

    7 MINUTES READ

    Unsupervised learning demystified

    Unsupervised learning may sound like a fancy way to say “let the kids learn on their own not to touch the hot oven” but it’s actually a pattern-finding technique for mining inspiration from your data. It has nothing to do with machines running around without adult supervision, forming their own opinions about things.  Unsupervised learning helps you find inspiration in data by grouping similar things together for you. There are many different ways of defining similarity, so keep trying algorithms and settings until a cool pattern catches your eye. Let’s demystify!
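As one hedged illustration of "grouping similar things together", the sketch below runs plain k-means (Lloyd's algorithm) on one-dimensional values, using absolute distance as the notion of similarity. The function name, the evenly spaced starting centers, and the iteration limit are assumptions made for this example; other algorithms and similarity measures would group the same data differently.

```python
from statistics import mean


def k_means_1d(values, k, n_iter=20):
    """Group 1-D values into k clusters; returns (centers, clusters)."""
    # Assumption: start from k evenly spaced points of the sorted data
    # so that the result is deterministic.
    ordered = sorted(values)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centers = [ordered[round(i * step)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # Assignments have stabilized.
        centers = new_centers
    return centers, clusters
```

Changing `k` or replacing the distance inside `min(...)` is the kind of "settings" experimentation the paragraph above describes.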

    4 MINUTES READ

    Four Ways to fail a Data scientist Job interview

    Hiring a data scientist can be excruciatingly painful for companies. It is an equally big deal for aspirants to bag that perfect offer in core data science, one that is not just a glossed-up role in name only. One evolves through various incremental stages of expertise to become a productive data scientist. For companies trying to identify one, it is like finding a needle in a haystack. For any aspiring data scientist, or one looking to move up, these are clear pitfalls to avoid.

    4 MINUTES READ

    Understanding The Different Job Roles In Data Science

    Making a career in data science is both challenging and overwhelming. There are plenty of courses online that claim to help you improve your skills in making the most of the technology and in setting your career on the right track. But none of it can really happen unless you understand all the different

    1 MINUTE READ

    Why Writing Skills Matters When Analyzing Big Data

    Enterprise leaders need big data specialists who are creative, can collaborate with others, are skilled in research and possess exceptional writing skills.  Enterprise leaders go about fulfilling their daily responsibilities, monitoring and evaluating organizational progress, coming up with new ideas – then, they do it all again. As this process continues its cycle and the business landscape continues to evolve, the role of written communication for big data projects will remain a mission-critical asset.

    3 MINUTES READ

    Six Reasons why Data Visualisation projects Fail

    With a boatload of visualisation tools at their disposal and fancy data scientists to play with them, enterprises still rarely make impactful use of data visualisation. Visualisation should be seen as a medium of storytelling using data. A visual story is a perfect blend of art and science. Practitioners must hone their skills to fuse the right aesthetic ingredients with scientific elements. This creates an output that is relevant for users, solves a specific business challenge and delivers ROI for enterprises.

    6 MINUTES READ

    How Outdated Technology Can Affect Your Business

    Technology evolves at an alarming rate, so much so that it seems almost pointless to push for the leading edge. By the time you adopt new technologies and systems into your business environment, something newer and more efficient is coming along right behind them. Applications or devices receive an official update, and you then have to take the time to configure them. It's easy to get caught up in either scenario, resulting in the regular use of outdated, inefficient technologies. But what you may not know is that this issue can have a profound impact on your business, productivity and workflow.

    4 MINUTES READ