Mid Level Data Scientist (W2 Candidates Only)

  • Tech Tandem Inc
  • Austin, Texas
  • Full Time
Remote. Posted 3 days ago, updated 3 hours ago.
Contract W2, 6 Months, Remote, $60 - $65/hr

Job Details

Skills: Python, AgenticAI, Agentic AI, AI Agent, Agentic-AI, LLM, LLMs, Bedrock, Foundation Models, Data Science, Modelling, Data Modelling, AWS, Lambdas

Summary

Job Role: Senior/Mid-Level Data Scientist
Location: 100% Remote Role

If you're interested, please send me a copy of your resume and the following details as soon as possible.

Overview:

At a high level, they are automating workflows in various areas of their Media Operations. As part of media campaigns, they pay Google & Meta to bring traffic to their site and need a way to track the effectiveness of the traffic hits they're getting. Google/Meta have options to send feedback on users to them (i.e. how "useful" was this user). This is easier for an ecommerce company to measure: this user spent $x on our products after being directed from Google/Meta and thus was useful/not useful to us.

RVO has a much less clear-cut means to provide this feedback. They essentially need to determine whether the user who was directed to them has the particular disease for which the advertisement was shown (was it relevant to this specific user?). In other words, they have to measure the prevalence of the disease among referred users to determine if the campaign is effective.

Their audience quality data comes from a third party and is received as PDFs, emails, and other unstructured data sources that are very messy. The work has two parts:

  • Part 1: They are developing an AI-automated solution to process and ingest that data into a tabular format in Databricks.
  • Part 2: They are building classic data science models to determine what users are doing on their site at an aggregate level, and a model that compares on-site behavior to audience quality.
Their tech stack is:

  • Databricks for housing and processing data
  • AWS for foundation models and AI orchestration (Lambdas)
  • Python
  • Jira, Asana
  • Snowflake (used in other parts of the org, but not for this project)
  • Cursor or Claude Code for code development/enhancement

Data Engineering experience is not required. It may be good to have given the nature of the agents they are building, but the data engineering side will build out the actual schemas and the fact and dimension tables in Databricks, and will help the data scientists understand the partitioning and indexing. Candidates should have worked with large datasets (billions of rows, roughly 100s of GBs or several TBs of data).

The immediate goal is to build an automated process, using AI-native tools, that can process the messy data from unstructured formats and load it into Databricks tables. In the future state, they could go directly from a PDF, extract the necessary data, and load it into a table. They want to deploy on-site and in real time, which will involve integration with the engineering team and the in-house tech stack.

This role will not be responsible for any platform components, or for building APIs to deploy the models into production. The goal is to get to a deployable model, so the work is more Data/AI Science in that sense, but they do not want code that lives only in notebooks: it should be production-grade code. The background will be a blend of agentic AI experience and traditional data science.

The AI agent will need to:

  • Extract key fields from the unstructured sources (PDF, email, attachment, etc.), such as audience quality metrics, disease prevalence rates, and campaign performance data, and convert them into a structured schema (Databricks tables).
  • Standardize and clean data: normalize formats across document types, resolve naming inconsistencies, and handle missing data/missing fields.
  • Enrich data: join extracted data with campaign data, Google/Meta data, and audience demographic data from the 3rd party.
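As a rough illustration of the "standardize and clean" step described above, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration and does not come from the posting: the alias map, the canonical field names, and the AudienceRecord type are hypothetical, and a real pipeline would take its schema from the Databricks tables defined by the data engineering side.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias map: raw labels as they might appear in source
# documents, mapped to hypothetical canonical column names.
FIELD_ALIASES = {
    "campaign": "campaign_id",
    "campaign id": "campaign_id",
    "campaign_id": "campaign_id",
    "prevalence": "disease_prevalence",
    "prevalence rate": "disease_prevalence",
    "disease prevalence": "disease_prevalence",
    "audience quality": "audience_quality",
    "aq score": "audience_quality",
}


@dataclass
class AudienceRecord:
    """Hypothetical target row; missing fields stay None."""
    campaign_id: Optional[str] = None
    disease_prevalence: Optional[float] = None  # stored as a fraction
    audience_quality: Optional[float] = None


def parse_rate(value) -> Optional[float]:
    """Turn '12.5%' or '0.125' into a float fraction; None if unparseable."""
    if value is None:
        return None
    text = str(value).strip()
    is_percent = text.endswith("%")
    try:
        number = float(text.rstrip("%").strip())
    except ValueError:
        return None  # e.g. 'n/a' or an empty string
    return number / 100 if is_percent else number


def standardize(raw: dict) -> AudienceRecord:
    """Resolve naming inconsistencies and normalize value formats."""
    record = AudienceRecord()
    for key, value in raw.items():
        label = key.strip().lower().replace("-", " ")
        canonical = FIELD_ALIASES.get(label)
        if canonical is None:
            continue  # unknown field: skipped in this sketch
        if canonical == "campaign_id":
            cleaned = "" if value is None else str(value).strip()
            record.campaign_id = cleaned or None
        else:
            setattr(record, canonical, parse_rate(value))
    return record


example = standardize(
    {"Campaign ID": " C-42 ", "Prevalence-Rate": "12.5%", "AQ Score": "n/a"}
)
```

In this sketch, unparseable or absent values become None rather than raising, so missing fields can be handled downstream; whether to skip, log, or reject unknown fields is a design choice the posting does not specify.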
Must Have:

  • Python
  • Agentic AI experience
  • LLMs (Bedrock foundation models)
  • Data science modeling experience
  • Databricks or Snowflake
  • Experience working with large datasets (100s of GBs to TBs, billions of rows)
  • AWS (Lambdas for orchestration)

Nice to Have:

  • Document AI/OCR familiarity
  • Experience in data engineering: unstructured data processing, data cleaning, standardization, joining/enriching datasets

Employers have access to artificial intelligence language tools (AI) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 91172969
Position Id: 8979896

Company Info

About Tech Tandem Inc

Founded in Austin, Texas, Tech Tandem helps corporations hire qualified candidates for their many recruitment needs. With global resources in Bangalore, India, we provide comprehensive staffing solutions worldwide.

Our Mission

Tech Tandem is a leading provider of IT staffing and strategic talent solutions. Our mission is to help organizations build high-performing teams by providing top-quality talent, industry expertise, and comprehensive support services.

Our Vision

To be the preferred strategic partner for our clients in building high-performing IT teams that drive business success through innovative and tailored staffing solutions.
Job ID: 523044324
Originally Posted on: 5/30/2026
