My Adventures in Data Science

Welcome to my blog! My name is Jonathan. Since 2007, I’ve worked as a Biochemistry Officer in the Army. My graduate training was in cellular and molecular pathology, but I’ve conducted research in many areas to meet the Army mission. I’ve mentored graduate medical education residents and fellows in basic and clinical research in the areas of trauma, maternal-fetal medicine, reproductive biology, and general medicine. I’ve also served as a Program Director and a Deputy Commander at a military research lab, and I co-chaired the Environmental Health and Protection Program Area Steering Committee for the Military Operational Medicine Research Program (MOMRP).

Most of my experience and publications center on ‘omics’ work supporting biomarker discovery and early qualification efforts funded by MOMRP, the Defense Health Program, the Defense Threat Reduction Agency, and other military sources. Currently, I’m a Fellow at the FDA’s Center for Drug Evaluation and Research (CDER) in the Office of New Drugs (OND). I work on data and text mining, natural language processing, statistical analysis, and predictive modeling to support regulatory science. After my fellowship, I’ll likely return to Ft. Detrick to work on advanced development.

What is a Data Scientist?

Although I was very familiar with standard ways to mine data and perform statistical analysis, learning R provided me with the tools and skills to tackle much more sophisticated approaches in data mining, text mining, natural language processing, advanced statistical analysis, predictive modeling, and machine learning.

As for the definition of a Data Scientist, here are two that I like:

– (1) “The definition of “data scientist” could be broadened to cover almost everyone who works with data in an organization. At the most basic level, you are a data scientist if you have the analytical skills and the tools to ‘get’ data, manipulate it and make decisions with it.” — Pat Hanrahan

– (2) “A data scientist is someone who can obtain, scrub, explore, model and interpret data, blending hacking, statistics and machine learning. Data scientists not only are adept at working with data, but appreciate data itself as a first-class product.” — Daniel Tunkelang

My definition:

– (3) “The data scientist wrangles data and conducts sophisticated analyses to develop models that inform decision making.”

Where am I headed as a Data Scientist?

Now is the time for me to blog about what I’m learning on a daily basis and to contribute to the online community of tutorials and information that I’ve relied on so much over the past couple of years. Perhaps I’ll focus on unique problems and offer periodic updates, summaries, and reviews on key areas of interest in R programming and data science.

As I begin a new start-up, Data InDeed, my goal is to conduct ‘data-informed’ research and provide services that help others understand their data to inform decision making. Check out my list of books related to data science. Many were simple, inexpensive quick fixes for gaps in my knowledge at the time, whereas others are much more foundational. I have definitely relied on community-based blogs to solve most of the difficult problems. Until next time…