This is the technical challenge. Make sure you answer all the questions and explain the thought process behind your approach. File format:
  1. Excel output (see attached)
  2. Code
    1. R: R Markdown file and HTML file of your work
    2. Python: Jupyter notebook and HTML file of your work
  3. Any findings you wish to provide (Word document)
Description

The customer identification analysis effort aims to identify customers (e.g. key opinion leaders, treating physicians, directors) who have a strong influence in a specific therapeutic area or disease state and whom we can engage for different objectives. Information is extracted from different data sources, including:

External data source: Community Liver Alliance

In this project we are trying to locate key institutions, key local support groups, and contacts that are listed as key resources to support patients with liver cancer.

Use this resource page to programmatically complete the following tasks:

  • Find local health systems and support groups for the entire US. Pull the information associated with individuals, contacts, and locations. Here’s an example:

    • Under the US state map (or state link), I have selected Northern California

      • Here’s the link associated with California

      • Go to the health systems or support groups section. There you will see the county, institution, address, and contact info (phone/email), as well as a contact person in some cases.

      • Note: we don’t need information on when the groups meet

  • Explain your approach

    • How do you make sure records you find are related to the subject we are looking for?

  • Split the data into its appropriate fields. For example, split Full Name into First, Middle, and Last Name. Do the same for the email address and the address information (address, city, state, zip code).

  • Final deliverable: an Excel dataset of unique health system and support group records with the following columns.

Here’s an example:

Full Name | First Name | Middle Name | Last Name | Credential* | Position* | Support Group Name | Address* | City* | State* | County | Email* | Phone*
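
The splitting and de-duplication steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference solution: it assumes names arrive as "First Middle Last, CREDENTIAL", that addresses follow a "street, city, ST 12345" pattern, and that a record is a duplicate when its group name, email, and phone digits all match. The helper names (`split_name`, `split_address`, `normalize_phone`, `dedupe`) and the sample values in the comments are hypothetical; the real pages may need different rules.

```python
import re

# Assumed address layout: "<street>, <city>, <ST> <zip>". Real listings may differ.
ADDRESS_RE = re.compile(
    r"^(?P<address>.+?),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
)


def split_name(full_name):
    """Split 'First Middle Last, CRED' into parts (simple heuristic)."""
    name_part, _, cred_part = full_name.partition(",")
    tokens = name_part.split()
    creds = [c.strip() for c in cred_part.split(",") if c.strip()]
    return {
        "Full Name": name_part.strip(),
        "First Name": tokens[0] if tokens else "",
        "Middle Name": " ".join(tokens[1:-1]),
        "Last Name": tokens[-1] if len(tokens) > 1 else "",
        "Credential": ", ".join(creds),
    }


def split_address(raw):
    """Split a one-line address; leave it unsplit if the pattern does not fit."""
    raw = raw.strip()
    match = ADDRESS_RE.match(raw)
    if not match:
        return {"Address": raw, "City": "", "State": "", "Zip": ""}
    return {
        "Address": match["address"].strip(),
        "City": match["city"].strip(),
        "State": match["state"],
        "Zip": match["zip"],
    }


def normalize_phone(raw):
    """Keep digits only and drop a leading US country code."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def dedupe(records):
    """Keep the first record for each (group name, email, phone) key."""
    seen, unique = set(), []
    for rec in records:
        key = (
            rec.get("Support Group Name", "").strip().lower(),
            rec.get("Email", "").strip().lower(),
            normalize_phone(rec.get("Phone", "")),
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Records that fail the address pattern are kept unsplit rather than dropped, so they can be reviewed by hand. The resulting list of dictionaries could then be written to Excel, for instance with pandas `DataFrame.to_excel`.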