Neo4J-JDillon.md

MA4128 Assessment

What is Data Science?

Data science is an interdisciplinary field concerned with the processes and systems used to extract knowledge or insights from data in various forms, whether structured or unstructured. It is a continuation of data analysis fields such as statistics, data mining, and predictive analytics.

At its core, data science involves using automated methods to analyze massive amounts of data and to extract knowledge from them. With such automated methods turning up everywhere from genomics to high-energy physics, data science is helping to create new branches of science, and influencing areas of social science and the humanities. The trend is expected to accelerate in the coming years as data from mobile sensors, sophisticated instruments, the web, and more, grows. In academic research, we will see an increasingly large number of traditional disciplines spawning new sub-disciplines with the adjective "computational" or "quantitative" in front of them. In industry, we will see data science transforming everything from healthcare to media. (http://datascience.nyu.edu/what-is-data-science/)

Data scientists use their data and analytical ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consistency of datasets; create visualizations to aid in understanding data; build mathematical models using the data; and present and communicate the data insights/findings. They are often expected to produce answers in days rather than months, to work by exploratory analysis and rapid iteration, and to present results with dashboards (displays of current values) rather than the papers and reports that statisticians normally produce. (https://www.import.io/post/data-scientists-vs-data-analysts-why-the-distinction-matters/)

Attributes Required for Data Scientists

An answer on the question-and-answer website Quora lists a number of characteristics considered important for anyone wishing to become a successful data scientist. They are as follows:

  • A Constant Pursuit of Learning

There's so much to learn as a data scientist that one has to be comfortable admitting what one doesn't know and seeking to bridge the knowledge gap. This is such a new field that covers so many skills and so many domains that it can be very daunting to start.

  • An Insatiable Curiosity

A data scientist's job is to answer deep questions using data and to gather insights to improve the business or the product. This is facilitated by a natural gnawing curiosity about the product, when you can form your own questions about the data that you have the power to answer.

  • An Optimistic Stubbornness

There are going to be a lot of insights that are just slightly out of reach, or are annoying or frustrating to get. A bit of stubbornness (combined with an insatiable curiosity) can help a data scientist overcome some little barriers to find those little gems in the rough.

  • An Ability to Prioritize

There are going to be so many possible insights to find, so many features that need to be built, and so many analyses that need to be done, especially if you're working at a smaller company and covering a very large surface area. The best data scientists will focus on what can lead to the most impact and be comfortable with saying no to everything else (for now).

  • A Practical Bent

Unfortunately, in many applications of data science there's no time for perfection. A data scientist has to maximize impact rather than accuracy - and that can mean that it's better to learn about 5 things 80% of the way rather than 1 thing 95% of the way. One has to be comfortable with doing something quickly and moving on, if need be.

  • A Healthy Dose of Skepticism

Unfortunately, many of the most surprising results that a data scientist will find are the result of a bug in their code (or someone else's code). Additionally, there are so many potentially wrong insights that a data scientist has to be wary of, from overfitting to spurious results to just plain wrong statistics. Data scientists always have to double-check their work.

  • An Impact-Driven Mindset

The quality of a data scientist is measured by her ability to execute on techniques and provide value to the product or business. There's little credit given for ideas without execution or theoretical improvements that just aren't practical (yet). This is one large difference between an academic and a data scientist.

  • A Habit of Dependability

A data scientist is the feedback loop in product decisions in data-driven companies - she is responsible for advising and proposing new features and initiatives, designing and analyzing experiments that can measure impact, and summarizing learnings so knowledge is preserved. She needs to communicate expectations well so the feedback loop is well-lubricated.

  • An Intuition Steeped in Domain Knowledge

There are going to be a lot of metrics that will be wrong and a lot of metrics that will drop. A strong product intuition based on prior experience can identify what's wrong and generate good hypotheses on why things dropped. Domain knowledge will give any data scientist an advantage on diagnosing problems and figuring out where to start.

  • A Love for Storytelling

Finally, portraying the data in an enthusiastic and intriguing manner is an essential quality for any data scientist. An audience needs to be engaged in order to fully understand and appreciate the tale that the data is telling.

(https://www.quora.com/What-are-the-key-traits-of-a-data-scientist)
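
The warning about spurious results under "A Healthy Dose of Skepticism" can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the Quora answer: the function names, sample sizes, and seed are all assumptions chosen for the example. It compares one random target against many unrelated random features and reports the strongest correlation found, which tends to look impressive even though every series is pure noise.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_noise_correlation(n_samples=10, n_features=200, seed=0):
    """Largest |correlation| between a noise target and unrelated noise features.

    All values are hypothetical: both the target and every feature are
    independent Gaussian noise, so any correlation found is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    print(round(strongest_noise_correlation(), 2))
```

With few samples and many candidate features, the best match is usually strong by chance alone, which is one reason a surprising result deserves a second look before it is reported.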

Neo4j

Create New Products and Services

  • Harvest new market opportunities by creating products and services that leverage data relationships.
  • Beat your competitors to market, reduce churn, and achieve company vision with the new applications you can create with Neo4j.