metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:809
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
- source_sentence: Data pipeline architecture, Azure Data Factory, Apache Spark
sentences:
- >-
Experience »
Prior experience working on a SAP ECC to SAP S4 Hana Migration
Project.4+ years in an ETL or Data Engineering roles; building and
implementing data pipelines and modeling data.Experience with SAP data
and data structures.Experience managing Snowflake instances, including
data ingestion and modeling.Experience with IBM DataStage is a plus.Very
strong skills with SQL with the ability to write efficient
queries.Familiarity with Fivetran for replication.
What You’ll Do
Job requirements are met.Perform data analysis required to troubleshoot
data related issues and assist in the resolution of data issues.
Interested?
Qualified candidates should send their resumes to
[email protected]
V-Soft Consulting Group is recognized among the top 100 fastest growing
staffing companies in North America, V-Soft Consulting Group is
headquartered in Louisville, KY with strategic locations in India,
Canada and the U.S. V-Soft is known as an agile, innovative technology
services company holding several awards and distinctions and has a wide
variety of partnerships across diverse technology stacks.
As a valued V-Soft Consultant, you’re eligible for full benefits
(Medical, Dental, Vision), a 401(k) plan, competitive compensation and
more. V-Soft is partnered with numerous Fortune 500 companies,
exceptionally positioned to advance your career growth.
V-Soft Consulting provides equal employment opportunities to all
employees and applicants for employment and prohibits discrimination and
harassment of any type without regard to race, color, religion, age,
sex, national origin, disability status, genetics, protected veteran
status, sexual orientation, gender identity or expression, or any other
characteristic protected by federal, state or local laws.
For more information or to view all our open jobs, please visit
www.vsoftconsulting.com or call (844) 425-8425.
- >-
experiences that leverage the latest technologies in open source and the
Cloud. Digital Information Management (DIM) is a team of engineers
committed to championing a data-driven decision-making culture and meets
the business demand for timely insight-focused analytics and information
delivery.
You will be working with all levels of technology from backend data
processing technologies (Databricks/Apache Spark) to other Cloud
computing technologies / Azure Data Platform. You should be a strong
analytical thinker, detail-oriented and love working with data with a
strong background in data engineering and application development. Must
be a hand-on technologist passionate about learning new technologies and
help improve the ways we can better leverage Advanced Analytics and
Machine Learning.
Responsibilities
Build end-to-end direct capabilities.Create and maintain optimal data
pipeline architecture.Build the infrastructure required for optimal
extraction, transformation, and loading of data from a wide variety of
data sources.Use analytics for capitalizing on the data for making
decisions and achieving better outcomes for the business.Derive insights
to differentiate member and team member experiences. Collaborate with
cross-functional teams.Analyze and define with product teams the data
migration and data integration strategies.Apply experience in analytics,
data visualization and modeling to find solutions for a variety of
business and technical problems.Querying and analyzing small and large
data sets to discover patterns and deliver meaningful insights.
Integrate source systems with information management solutions and
target systems for automated migration processes.Create
proof-of-concepts to demonstrate viability of solutions under
consideration.
Qualifications
Bachelor’s degree in computer science, information systems, or other
technology-related field or equivalent number of years of
experience.Advanced hands-on experience implementing and supporting
large scale data processing pipelines and migrations using technologies
(eg. Azure Services, Python programming).Significant hands-on experience
with Azure services such as Azure Data Factory (ADF), Azure Databricks,
Azure Data Lake Storage (ADLS Gen2), Azure SQL, and other data sources.
Significant hands-on experience designing and implementing reusable
frameworks using Apache Spark (PySpark preferred or Java/Scala).Solid
foundation in data structures, algorithms, design patterns and strong
analytical and problem-solving skills.Strong hands-on experience leading
design thinking as well as the ability to translate ideas to clearly
articulate technical solutions. Experience with any of the following
Analytics and Information Management competencies: Data Management and
Architecture, Performance Management, Information Delivery and Advanced
Analytics.
Desired Qualifications
Proficiency in collaborative coding practices, such as pair programming,
and ability to thrive in a team-oriented environment.The following
certifications:Microsoft Certified Azure Data EngineerMicrosoft
Certified Azure Solutions ArchitectDatabricks Certified Associate
Developer for Apache 2.4/3.0
Hours: Monday - Friday, 8:00AM - 4:30PM
Location: 820 Follin Lane, Vienna, VA 22180 | 5510 Heritage Oaks Drive
Pensacola, FL 32526 | 141 Security Drive Winchester, VA 22602
About Us
You have goals, dreams, hobbies, and things you're passionate
about—what's important to you is important to us. We're looking for
people who not only want to do meaningful, challenging work, keep their
skills sharp and move ahead, but who also take time for the things that
matter to them—friends, family, and passions. And we're looking for team
members who are passionate about our mission—making a difference in
military members' and their families' lives. Together, we can make it
happen. Don't take our word for it:
Military Times 2022 Best for Vets Employers WayUp Top 100 Internship Programs Forbes® 2022 The Best Employers for New Grads Fortune Best Workplaces for Women Fortune 100 Best Companies to Work For® Computerworld® Best Places to Work in IT Ripplematch Campus Forward Award - Excellence in Early Career Hiring Fortune Best Place to Work for Financial and Insurance Services
Disclaimers: Navy Federal reserves the right to fill this role at a
higher/lower grade level based on business need. An assessment may be
required to compete for this position. Job postings are subject to close
early or extend out longer than the anticipated closing date at the
hiring team’s discretion based on qualified applicant volume. Navy
Federal Credit Union assesses market data to establish salary ranges
that enable us to remain competitive. You are paid within the salary
range, based on your experience, location and market position
Bank Secrecy Act: Remains cognizant of and adheres to Navy Federal
policies and procedures, and regulations pertaining to the Bank Secrecy
Act.
- >-
Data AnalystDakota Dunes, SD
Entry Level SQL, Run SQL The queries. Client is using
ThoughtspotUnderstanding of Dashbord and Proficient in Microsoft Office
and excel
Please share your profile to [email protected] or reach me on
619 771 1188.
- source_sentence: >-
Customer data management, regulatory compliance, advanced Excel and Access
proficiency
sentences:
- >-
skills, attention to detail, and experience working with data in Excel.
The candidate must enjoy collaborative work, actively participate in the
development of team presentations, and engage in review of other analyst
findings. ResponsibilitiesThe Junior Analyst will be responsible for
examining data from different sources with the goal of providing
insights into NHLBI, its mission, business processes, and information
systems. Responsibilities for this position include:Develop a strong
understanding of the organization, functions, and data sources to be
able to ensure analytical sources and methodologies are appropriately
applied for the data need.Develop clear and well-structured analytical
plans.Ensure data sources, assumptions, methodologies, and visualization
approaches are consistent with prior work by the OPAE.Assess the
validity of source data and subsequent findings.Produce high quality,
reliable data analysis on a variety of functional areas.Explain the
outcome/results by identifying trends and creating visualizations.Use
best practices in data analysis and visualization.Exhibit results,
conclusions, and recommendations to leadership, and customize
presentations to align with various audiences.Document and communicate
analysis results (briefings, reports, and/or backup analysis files) in a
manner that clearly articulates the approach, results, and data-driven
recommendations.Continually assess all current activities and
proactively communicate potential issues and/or challenges.May support
data scientists on various projects. Qualifications Minimum
qualifications:Bachelor’s degree in data science or related
fields.Minimum of 2 years of demonstrable experience in data
analysis.Must have 2 years of experience in using Excel for data
analysis and visualization andWillingness to learn basic data science
tools and methodologies.Intermediate to advanced proficiency with
industry-standard word processing, spreadsheet, and presentation
software programs.Excellent verbal and written communication
skills.Strong attention to detail.Collaborative team player.Proven
problem solving and critical thinking skills.Must be able to obtain
Public Trust Clearance.US work authorization (we participate in
E-Verify). Preferred qualifications:Proficient in the use of basic data
science tools and methodologies (python, SQL, machine learning).MS in
data science or related fields.
Salary and benefitsWe offer a competitive salary and a generous benefits
package, including full health and dental, HSA and retirement accounts,
short- and long-term disability insurance, life insurance, paid time off
and 11 federal holidays. Location: Washington DC, Hybrid
- >-
SKILLS – Very Strong, Microsoft Excel (Pivot Tables, Sumifs, Vlookups
etc), Data manipulation, Logistics and operations terminology Job
SummaryApple AMR Ops Logistics is looking for an experienced Data
Analyst to support its Business Analytics team. This position will be
responsible for ensuring maintenance and frequent updates to Apple’s
internal Shipping Exceptions Management System. The position will work
closely with AMR Logistics stakeholders to ensure timely execution of
daily jobs by transforming data in Excel into Apple’s internal tools.
Key Responsibilities• Review multiple Excel reports and ensure timely
uploads into the Shipping Exceptions Management System• Develop robust
data visualizations that will help to answer commonly asked questions
quickly and thoroughly about Shipping Exceptions• Identify data
anomalies, work to root cause and remediate issues in data collection,
storage, transformation, or reporting Key Qualifications1 – 2 years of
work experience preferredSkilled in Excel and data manipulation
(mandatory)Familiarity with Logistics and Operations
terminologyFamiliarity with Business Objects a plusAbility to create
cross-platform reportsAbility to turn data into information and
insightsHigh-level attention to detail, including the ability to spot
data errors and potential issues in Apple’s internal systems Hard
Skills:Microsoft Excel (Pivot Tables, Sumifs, Vlookups etc)Good Verbal
and Communication skills
- >-
Qualifications:0-2 years relevant experienceAdvanced knowledge of MS
Office Suite, including proficiency in Excel and Access.Consistently
demonstrates clear and concise written and verbal communication
skills.Demonstrated organization skills with an excellent attention to
detail.Ability to focus on high quality work.
Education:Bachelor’s/University degree or equivalent experiencePlease
share with me your updated resume if you are interested in applying for
this role.
Dexian is a leading provider of staffing, IT, and workforce solutions
with over 12,000 employees and 70 locations worldwide. As one of the
largest IT staffing companies and the 2nd largest minority-owned
staffing company in the U.S., Dexian was formed in 2023 through the
merger of DISYS and Signature Consultants. Combining the best elements
of its core companies, Dexian's platform connects talent, technology,
and organizations to produce game-changing results that help everyone
achieve their ambitions and goals.Dexian's brands include Dexian DISYS,
Dexian Signature Consultants, Dexian Government Solutions, Dexian Talent
Development and Dexian IT Solutions. Visit https://dexian.com/ to learn
more.Dexian is
- source_sentence: >-
Clarity PPM reporting, data dashboard customization, performance quality
assurance
sentences:
- >-
skills and the ability to connect and communicate across multiple
departments.Adept at report writing and presenting findings.Ability to
work under pressure and meet tight deadlines.Be able to read and update
project and program level resource forecasts.Identify recurring process
issues and work with managers to find solutions and initiate
improvements to mitigate future recurrence.
Skills and Qualifications:5+ years in a Data Analyst and/or Data
Scientist capacity.5 years of experience with Clarity PPM reporting,
developing data dashboards, charts and datasets in Clarity.Strong
knowledge of and experience with reporting packages (Business Objects,
Tableau, Power BI, etc.), databases (SQL), programming (XML, JavaScript,
etc.).Knowledge of statistics and experience using statistical packages
for analyzing datasets (Excel, SAS, R, SPSS, etc.)High understanding of
PPM disciplines has worked in a team and covered strategic projects.
Experience with Dashboard customization, configuration, user interface
personalization and infrastructure management will be helpful.Strong
analytical skills with the ability to collect, organize, analyze, and
disseminate significant amounts of information with attention to detail,
accuracy, and actionable insights.Excellent communicator, adjusting
communication styles based on your audience.Quick learner, adaptable and
can thrive in new environments.Proactive, confident, and engaging;
especially when it comes to large stakeholder groups.Capable of
critically evaluating data to derive meaningful, actionable
insights.Demonstrate superior communication and presentation
capabilities, adept at simplifying complex data insights for audiences
without a technical background.
- >-
skills and current Lubrizol needs):
Create predictive models by mining complex data for critical formulating
or testing insights Implement and assess algorithms in R, Python, SAS,
JMP or C#/C++ Research and implement new statistical, machine learning
and/or optimization approaches (PhD level)Collaborate with data science
team, as well as, scientists and engineers, to understand their needs,
and find creative solutions to meet those needs
Previous Intern Projects Include
Predictive modeling using Bayesian and machine learning methods R/Shiny
tool development to enable model predictions and formulation
optimization Creation of an interactive visualization tool for
monitoring predictive models Multitask learning (transfer learning)
using co-regionalized Gaussian Processes (PhD level)Multi-objective
optimization using genetic algorithms (PhD level)Survival modeling using
bagged Cox proportional hazards regression trees (PhD level)Bootstrap
variance estimation for complex nonlinear models (PhD level)
What tools do you need for success?
Enrolled in a Masters or PhD program such as statistics, data analytics,
machine learningExcellent programming skills with the ability to learn
new methods quicklyExposure to database systems and the ability to
efficiently manipulate complex data Interest and experience in advanced
statistical modeling/machine learning methods (PhD level)Coursework in
statistical modeling and data mining methodsCuriosity and creativity
Benefits Of Lubrizol’s Chemistry Internship Programs
Rewarding your hard work!Competitive payHoliday pay for holidays that
fall within your work periodFUN! We host a variety of events and
activities for our students. Past events include a Cleveland Cavaliers
game, paid volunteering days, professional development and networking
events, and even a picnic hosted by our CEO!
While headquartered in the United States, Lubrizol is truly a global
specialty chemical company. We have a major presence in five global
regions and do business in more than 100 countries. Our corporate
culture ensures that Lubrizol is one company throughout the world, but
you will find each region is a unique place to work, live and play.
Lubrizol is
- >-
experience with agile engineering and problem-solving creativity. United
by our core values and our purpose of helping people thrive in the brave
pursuit of next, our 20,000+ people in 53 offices around the world
combine experience across technology, data sciences, consulting and
customer obsession to accelerate our clients’ businesses through
designing the products and services their customers truly value.
Job Description
This position requires in-depth knowledge and expertise in GCP services,
architecture, and best practices. Will work closely with clients to
understand their business objectives and develop strategies to leverage
GCP to meet their needs. They will collaborate with cross-functional
teams to design, implement, and manage scalable and reliable cloud
solutions. They will also be responsible for driving innovation and
staying up-to-date with the latest GCP technologies and trends to
provide industry-leading solutions.
Your Impact:
Collaborate with clients to understand their business requirements and
design GCP architecture to meet their needs.Develop and implement cloud
strategies, best practices, and standards to ensure efficient and
effective cloud utilization.Work with cross-functional teams to design,
implement, and manage scalable and reliable cloud solutions on
GCP.Provide technical guidance and mentorship to the team to develop
their skills and expertise in GCP.Stay up-to-date with the latest GCP
technologies, trends, and best practices and assess their applicability
to client solutions.Drive innovation and continuous improvement in GCP
offerings and services to provide industry-leading solutions.Collaborate
with sales and business development teams to identify and pursue new
business opportunities related to GCP.Ensure compliance with security,
compliance, and governance requirements in GCP solutions.Develop and
maintain strong relationships with clients, vendors, and internal
stakeholders to promote the adoption and success of GCP solutions.
Qualifications
Must have good implementationexperience onvariousGCP’s Data Storage and
Processing services such as BigQuery, Dataflow, Bigtable, Dataform, Data
fusion, cloud spanner, Cloud SQLMust have programmatic experience with
tools like Javascript, Python, Apache Spark.Experience in building
advance Bigquery SQL and Bigquery modelling is requiredExperience in
orchestrating end-end data pipelines with tools like cloud composer,
Dataform is highly desired.Experience in managing complex and reusable
dataflow pipelines is highly desired.
What sets you apart:
Experience in complex migrations from legacy data warehousing solutions
or on-prem datalakes to GCPExperience in maneuvering resources in
delivering tight projectsExperience in building real-time ingestion and
processing frameworks on GCP.Adaptability to learn new technologies and
products as the job demands.Experience in implementing Data-governance
solutionsKnowledge in AI, ML and GEN-AI use casesMulti-cloud & hybrid
cloud experienceAny cloud certification
Additional Information
Flexible vacation policy; Time is not limited, allocated, or accrued16
paid holidays throughout the yearGenerous parental leave and new parent
transition programTuition reimbursementCorporate gift matching program
Career Level: Senior Associate
Base Salary Range for the Role: 115,000-150,000 (varies depending on
experience) The range shown represents a grouping of relevant ranges
currently in use at Publicis Sapient. Actual range for this position may
differ, depending on location and specific skillset required for the
work itself.
- source_sentence: Go-to-Market strategy, Salesforce dashboard development, SQL data analysis
sentences:
- >-
experience: from patients finding clinics and making appointments, to
checking in, to clinical documentation, and to the final bill paid by
the patient. Our team is committed to changing healthcare for the better
by innovating and revolutionizing on-demand healthcare for millions of
patients across the country.
Experity offers the following:
Benefits – Comprehensive coverage starts first day of employment and
includes Medical, Dental/Orthodontia, and Vision.Ownership - All Team
Members are eligible for synthetic ownership in Experity upon one year
of employment with real financial rewards when the company is
successful!Employee Assistance Program - This robust program includes
counseling, legal resolution, financial education, pet adoption
assistance, identity theft and fraud resolution, and so much
more.Flexibility – Experity is committed to helping team members face
the demands of juggling work, family and life-related issues by offering
flexible work scheduling to manage your work-life balance.Paid Time Off
(PTO) - Experity offers a generous PTO plan and increases with
milestones to ensure our Team Members have time to recharge, relax, and
spend time with loved ones.Career Development – Experity maintains a
learning program foundation for the company that allows Team Members to
explore their potential and achieve their career goals.Team Building
– We bring our Team Members together when we can to strengthen the team,
build relationships, and have fun! We even have a family company picnic
and a holiday party.Total Compensation - Competitive pay, quarterly
bonuses and a 401(k) retirement plan with an employer match to help you
save for your future and ensure that you can retire with financial
security.
Hybrid workforce:
Experity offers Team Members the opportunity to work remotely or in an
office. While this position allows remote work, we require Team Members
to live within a commutable distance from one of our locations to ensure
you are available to come into the office as needed.
Job Summary:
We are seeking a highly skilled and data-driven Go-to-Market (GTM) Data
Analyst to join our team. The ideal candidate will be adept at
aggregating and analyzing data from diverse sources, extracting valuable
insights to inform strategic decisions, and proficient in building
dynamic dashboards in Salesforce and other BI tools. Your expertise in
SQL and data analytics will support our go-to-market strategy, optimize
our sales funnel, and contribute to our overall success.
Experience:
Bachelor’s or Master’s degree in Data Science, Computer Science,
Information Technology, or a related field.Proven experience as a Data
Analyst or similar role, with a strong focus on go-to-market
strategies.Expertise in SQL and experience with database
management.Proficiency in Salesforce and other BI tools (e.g., Tableau,
Power BI).Strong analytical skills with the ability to collect,
organize, analyze, and disseminate significant amounts of information
with attention to detail and accuracy.Excellent communication and
presentation skills, capable of conveying complex data insights in a
clear and persuasive manner.Adept at working in fast-paced environments
and managing multiple projects simultaneously.Familiarity with sales and
marketing metrics, and how they impact business decisions.
Budgeted salary range:
$66,900 to $91,000
Team Member Competencies:
Understands role on the team and works to achieve goals to the best of
your ability.Working within a team means there will be varying opinions
and ideas. Active listening and thoughtfully responding to what your
team member says.Take responsibility for your mistakes and look for
solutions. Understand how your actions impact team.Provides assistance,
information, or other support to others to build or maintain
relationships.Maintaining a positive attitude. Tackle challenges as they
come, and don’t let setbacks get you down.Gives honest and constructive
feedback to other team members.When recognizing a problem, take action
to solve it.Demonstrates and supports the organization's core values.
Every team member exhibits our core values:
Team FirstLift Others UpShare OpenlySet and Crush GoalsDelight the
Client
Our urgent care solutions include:
Electronic Medical Records (EMR): Software that healthcare providers use
to input patient data, such as medical history, diagnoses, treatment
plans, medications, and test results.Patient Engagement (PE): Software
that shows patients the wait times at various clinics, allows patients
to reserve a spot in line if there's a wait, and book the
appointment.Practice Management (PM): Software that the clinic front
desk staff uses to register the patient once they arrive for their
appointment.Billing and Revenue Cycle Management (RCM): Software that
manages coding, billing and payer contracts for clinics so they don’t
have to.Teleradiology: Board certified radiologist providing accurate
and timely reads of results from X-rays, CT scans, MRIs, and
ultrasounds, for our urgent care clients.Consulting: Consulting services
for urgent care clinics to assist with opening, expanding and enhancing
client's businesses
- >-
experience with Cloud Engineering / Services.3+ years of work experience
as a backend software engineer in Python with exceptional software
engineering knowledge. Experience with ML workflow orchestration tools:
Airflow, Kubeflow etc. Advanced working knowledge of
object-oriented/object function programming languages: Python, C/C++,
JuliaExperience in DevOps: Jenkins/Tekton etc. Experience with cloud
services, preferably GCP Services like Vertex AI, Cloud Function,
BigQuery etc. Experience in container management solution: Kubernetes,
Docker.Experience in scripting language: Bash, PowerShell etc.
Experience with Infrastructure as code: Terraform etc.
Skills Preferred:Master focused on Computer Science / Machine Learning
or related field. Experience working with Google Cloud platform (GCP) -
specifically Google Kubernetes engine, Terraform, and
infrastructure.Experience in delivering cloud engineering
products.Experience in programming concepts such as Paired Programming,
Test Driven Development, etc. Understanding of MLOPs/Machine Learning
Life Cycle and common machine learning frameworks: sklearn, TensorFlow,
pytorch etc. is a big plus.Must be a quick learner and open to learning
new technology. Experience applying agile practices to solution
delivery. Experience in all phases of the development lifecycle. Must be
team-oriented and have excellent oral and written communication skills.
Good organizational and time-management skills. Must be a self-starter
to understand existing bottlenecks and come up with innovative
solutions. Knowledge of coding and software craftsmanship
practices.Experience and good understanding of GCP processing /DevOPs/
Machine Learning
- >-
Skills
Good banking domain background with Advanced SQL knowledge is a MUST
Expert in Advanced Excel functions used for data analysis Ability to Understand Physical and Logical Data Models and understanding of Data Quality Concepts. Write SQL Queries to pull/fetch data from systems/DWH Understanding of Data WareHousing concepts Understanding the Data Movement between Source and Target applications and perform data quality checks to maintain the data integrity, accuracy and consistency Experience in analysis/reconciliation of data as per the business requirements Conduct research and Analysis in order to come up with solution to business problems Understanding requirements directly from clients/ client stakeholders and writing code to extract relevant data and produce report
Experience Required
10-12 Years
Roles & Responsibilities
Interpret data, analyze results using Data Analysis techniques and
provide ongoing reports
Develop and implement databases, data repositories for performing analysis Acquire data from primary or secondary data sources and maintain databases/data repositories Identify, analyze, and interpret trends or patterns in complex data sets Filter and “clean” data by reviewing computer reports, printouts, and performance indicators to locate and correct code problems ; Work with management to prioritize business and information needs Locate and define new process improvement opportunities Good exposure and hands on exp with Excel features used for data analysis & reporting
- source_sentence: >-
Senior Data Scientist, Statistical Analysis, Data Interpretation, TS/SCI
Clearance
sentences:
- >-
Skills :8+ years of relevant experienceExperience with big data
technology(s) or ecosystem in Hadoop, HDFS (also an understanding of
HDFS Architecture), Hive, Map Reduce, Base - this is considering all of
AMP datasets are in HDFS/S3Advanced SQL and SQL performance tuningStrong
experience in Spark and Scala
- >-
experience, regulatory compliance & operational efficiencies, enabled by
Google Cloud.
This position will lead integration of core data from New North America
Lending platforms into Data Factory (GCP BQ), and build upon the
existing analytical data, including merging historical data from legacy
platforms with data ingested from new platforms. To enable critical
regulatory reporting, operational analytics, risk analytics and modeling
Will provide overall technical guidance to implementation teams and
oversee adherence to engineering patterns and data quality and
compliance standards, across all data factory workstreams. Support
business adoption of data from new platform and sunset of legacy
platforms & technology stack.
This position will collaborate with technical program manager, data
platform enablement manager, analytical data domain leaders, subject
matter experts, supplier partners, business partner and IT operations
teams to deliver the Data integration workstream plan following agile
framework.
Responsibilities
We are looking for dynamic, technical leader with prior experience of
leading data warehouse as part of complex business & tech
transformation. Has strong experience in Data Engineering, GCP Big
Query, Data ETL pipelines, Data architecture, Data Governance, Data
protection, security & compliance, and user access enablement.
Key responsibilities -
This role will focus on implementing data integration of new lending
platform into Google Cloud Data Platform (Data factory), existing
analytical domains and building new data marts, while ensuring new data
is integrated seamlessly with historical data. Will lead a dedicated
team of data engineers & analysts to understand and assess new data
model and attributes, in upstream systems, and build an approach to
integrate this data into factory.Will lead the data integration
architecture (in collaboration with core mod platform & data factory
architects) and designs, and solution approach for Data FactoryWill
understand the scope of reporting for MMP (Minimal Marketable Product)
launch & build the data marts required to enable agreed use cases for
regulatory, analytical & operational reporting, and data required for
Risk modeling. Will collaborate with Data Factory Analytical domain
teams, to build new pipelines & expansion of analytical domains. Will
lead data integration testing strategy & its execution within Data
Factory (end-to-end, from ingestion, to analytical domains, to marts) to
support use cases.Will be Data Factory SPOC for all Core Modernization
program and help facilitate & prioritize backlogs of data
workstreams.Ensure the data solutions are aligned to overall program
goals, timing and are delivered with qualityCollaborate with program
managers to plan iterations, backlogs and dependencies across all
workstream to progress workstreams at required pace.Drive adoption of
standardized architecture, design and quality assurance approaches
across all workstreams and ensure solutions adheres to established
standards.People leader for a team of 5+ data engineers and analysts.
Additionally manage supplier partner team who will execute the migration
planLead communication of status, issues & risks to key stakeholders
Qualifications
You'll have…..
Bachelor’s degree in computer science or equivalent5+ years of
experience delivering complex Data warehousing projects and leading
teams of 10+ engineers and suppliers to build Big Data/Datawarehouse
solutions.10+ years of experience in technical delivery of Data
Warehouse Cloud Solutions for large companies, and business adoption of
these platforms to build analytics , insights & modelsPrior experience
with cloud data architecture, data modelling principles, DevOps,
security and controls Google Cloud certified - Cloud Data Engineer
preferred.Hands on experience of the following:Orchestration of data
pipelines (e.g. Airflow, DBT, Dataform, Astronomer).Batch data pipelines
(e.g. BQ SQL, Dataflow, DTS).Streaming data pipelines (e.g. Kafka,
Pub/Sub, gsutil)Data warehousing techniques (e.g. data modelling,
ETL/ELT).
Even better, you may have….
Master’s degree in- Computer science, Computer engineering, Data science
or related fieldKnowledge of Ford credit business functional, core
systems, data knowledge Experience in technical program management &
delivering complex migration projects.Building high performance
teamsManaging/or working with globally distributed teamsPrior experience
in leveraging offshore development service providers.Experience in a
Fintech or large manufacturing company.Very strong leadership,
communication, organizing and problem-solving skills.Ability to
negotiate with and influence stakeholders & drive forward strategic data
transformation.Quick learner, self-starter, energetic leaders with drive
to deliver results. Empathy and care for customers and teams, as a
leader guide teams on advancement of skills, objective setting and
performance assessments
You may not check every box, or your experience may look a little
different from what we've outlined, but if you think you can bring value
to Ford Motor Company, we encourage you to apply!
As an established global company, we offer the benefit of choice. You
can choose what your Ford future will look like: will your story span
the globe, or keep you close to home? Will your career be a deep dive
into what you love, or a series of new teams and new skills? Will you be
a leader, a changemaker, a technical expert, a culture builder...or all
of the above? No matter what you choose, we offer a work life that works
for you, including:
Immediate medical, dental, and prescription drug coverageFlexible family
care, parental leave, new parent ramp-up programs, subsidized back-up
childcare and moreVehicle discount program for employees and family
members, and management leasesTuition assistanceEstablished and active
employee resource groupsPaid time off for individual and team community
serviceA generous schedule of paid holidays, including the week between
Christmas and New Year's DayPaid time off and the option to purchase
additional vacation time
For a detailed look at our benefits, click here:
2024 New Hire Benefits Summary
Visa sponsorship is not available for this position.
Candidates for positions with Ford Motor Company must be legally
authorized to work in the United States. Verification of employment
eligibility will be required at the time of hire.
We are
- >-
experience to solve some of the most challenging intelligence issues
around data.
Job Responsibilities & Duties
Devise strategies for extracting meaning and value from large datasets.
Make and communicate principled conclusions from data using elements of
mathematics, statistics, computer science, and application specific
knowledge. Through analytic modeling, statistical analysis, programming,
and/or another appropriate scientific method, develop and implement
qualitative and quantitative methods for characterizing, exploring, and
assessing large datasets in various states of organization, cleanliness,
and structure that account for the unique features and limitations
inherent in data holdings. Translate practical needs and analytic
questions related to large datasets into technical requirements and,
conversely, assist others with drawing appropriate conclusions from the
analysis of such data. Effectively communicate complex technical
information to non-technical audiences.
Minimum Qualifications
10 years relevant experience with Bachelors in related field; or 8 years
experience with Masters in related field; or 6 years experience with a
Doctoral degree in a related field; or 12 years of relevant experience
and an Associates may be considered for individuals with in-depth
experienceDegree in an Mathematics, Applied Mathematics, Statistics,
Applied Statistics, Machine Learning, Data Science, Operations Research,
or Computer Science, or related field of technical
rigorAbility/willingness to work full-time onsite in secure government
workspacesNote: A broader range of degrees will be considered if
accompanied by a Certificate in Data Science from an accredited
college/university.
Clearance Requirements
This position requires a TS/SCI with Poly
Looking for other great opportunities? Check out Two Six Technologies
Opportunities for all our Company’s current openings!
Ready to make the first move towards growing your career? If so, check
out the Two Six Technologies Candidate Journey! This will give you
step-by-step directions on applying, what to expect during the
application process, information about our rich benefits and perks along
with our most frequently asked questions. If you are undecided and would
like to learn more about us and how we are contributing to essential
missions, check out our Two Six Technologies News page! We share
information about the tech world around us and how we are making an
impact! Still have questions, no worries! You can reach us at Contact
Two Six Technologies. We are happy to connect and cover the information
needed to assist you in reaching your next career milestone.
Two Six Technologies is
If you are an individual with a disability and would like to request
reasonable workplace accommodation for any part of our employment
process, please send an email to [email protected].
Information provided will be kept confidential and used only to the
extent required to provide needed reasonable accommodations.
Additionally, please be advised that this business uses E-Verify in its
hiring practices.
By submitting the following application, I hereby certify that to the
best of my knowledge, the information provided is true and accurate.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on sentence-transformers/all-distilroberta-v1
results:
- task:
type: triplet
name: Triplet
dataset:
name: ai job validation
type: ai-job-validation
metrics:
- type: cosine_accuracy
value: 0.9900990128517151
name: Cosine Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: ai job test
type: ai-job-test
metrics:
- type: cosine_accuracy
value: 1
name: Cosine Accuracy
SentenceTransformer based on sentence-transformers/all-distilroberta-v1
This is a sentence-transformers model finetuned from sentence-transformers/all-distilroberta-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-distilroberta-v1
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
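To make the three modules above concrete, here is a minimal sketch of the same pipeline (Transformer encoder, mean pooling, L2 normalization) written against the lower-level transformers API. It assumes the checkpoint's weights and tokenizer load via AutoModel/AutoTokenizer, as Sentence Transformers checkpoints generally do; the input string is a placeholder.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

repo = "krshahvivek/distilroberta-ai-job-embeddings"
tokenizer = AutoTokenizer.from_pretrained(repo)
encoder = AutoModel.from_pretrained(repo)

batch = tokenizer(["Senior Data Scientist, TS/SCI Clearance"],
                  padding=True, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (batch, seq_len, 768)

# Module (1): mean-pool token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()
embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# Module (2): L2-normalize, so cosine similarity reduces to a dot product.
embedding = F.normalize(embedding, p=2, dim=1)  # (batch, 768)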
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("krshahvivek/distilroberta-ai-job-embeddings")
# Run inference
sentences = [
'Senior Data Scientist, Statistical Analysis, Data Interpretation, TS/SCI Clearance',
'experience to solve some of the most challenging intelligence issues around data.\n\nJob Responsibilities & Duties\n\nDevise strategies for extracting meaning and value from large datasets. Make and communicate principled conclusions from data using elements of mathematics, statistics, computer science, and application specific knowledge. Through analytic modeling, statistical analysis, programming, and/or another appropriate scientific method, develop and implement qualitative and quantitative methods for characterizing, exploring, and assessing large datasets in various states of organization, cleanliness, and structure that account for the unique features and limitations inherent in data holdings. Translate practical needs and analytic questions related to large datasets into technical requirements and, conversely, assist others with drawing appropriate conclusions from the analysis of such data. Effectively communicate complex technical information to non-technical audiences.\n\nMinimum Qualifications\n\n10 years relevant experience with Bachelors in related field; or 8 years experience with Masters in related field; or 6 years experience with a Doctoral degree in a related field; or 12 years of relevant experience and an Associates may be considered for individuals with in-depth experienceDegree in an Mathematics, Applied Mathematics, Statistics, Applied Statistics, Machine Learning, Data Science, Operations Research, or Computer Science, or related field of technical rigorAbility/willingness to work full-time onsite in secure government workspacesNote: A broader range of degrees will be considered if accompanied by a Certificate in Data Science from an accredited college/university.\n\nClearance Requirements\n\nThis position requires a TS/SCI with Poly\n\nLooking for other great opportunities? Check out Two Six Technologies Opportunities for all our Company’s current openings!\n\nReady to make the first move towards growing your career? If so, check out the Two Six Technologies Candidate Journey! This will give you step-by-step directions on applying, what to expect during the application process, information about our rich benefits and perks along with our most frequently asked questions. If you are undecided and would like to learn more about us and how we are contributing to essential missions, check out our Two Six Technologies News page! We share information about the tech world around us and how we are making an impact! Still have questions, no worries! You can reach us at Contact Two Six Technologies. We are happy to connect and cover the information needed to assist you in reaching your next career milestone.\n\nTwo Six Technologies is \n\nIf you are an individual with a disability and would like to request reasonable workplace accommodation for any part of our employment process, please send an email to [email protected]. Information provided will be kept confidential and used only to the extent required to provide needed reasonable accommodations.\n\nAdditionally, please be advised that this business uses E-Verify in its hiring practices.\n\n\n\nBy submitting the following application, I hereby certify that to the best of my knowledge, the information provided is true and accurate.',
'Skills :8+ years of relevant experienceExperience with big data technology(s) or ecosystem in Hadoop, HDFS (also an understanding of HDFS Architecture), Hive, Map Reduce, Base - this is considering all of AMP datasets are in HDFS/S3Advanced SQL and SQL performance tuningStrong experience in Spark and Scala',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
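Because the final Normalize module produces unit-length vectors, cosine similarity and dot product coincide, which makes query-vs-corpus ranking straightforward. The snippet below is an illustrative sketch of semantic search with this model; the query and corpus strings are invented examples, not data from the training set.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("krshahvivek/distilroberta-ai-job-embeddings")

query = "Machine learning engineer, PyTorch, model deployment"  # hypothetical query
corpus = [
    "Data Analyst role focused on Excel reporting and dashboards",
    "ML Engineer building and deploying PyTorch models on GCP",
]
query_emb = model.encode(query)
corpus_emb = model.encode(corpus)

# Rank corpus entries by cosine similarity to the query.
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))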
Evaluation
Metrics
Triplet
- Datasets: ai-job-validation and ai-job-test
- Evaluated with TripletEvaluator
Metric | ai-job-validation | ai-job-test |
---|---|---|
cosine_accuracy | 0.9901 | 1.0 |
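The accuracies above come from TripletEvaluator, which measures how often an anchor embedding is closer to its positive than to its negative. A minimal sketch of running the same evaluation follows; the triplet strings here are invented placeholders, not the actual validation data.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("krshahvivek/distilroberta-ai-job-embeddings")

# Placeholder triplets: (query, matching posting, non-matching posting).
anchors = ["GCP Data Engineer, BigQuery, Airflow DAG"]
positives = ["6-8 years building data pipelines on GCP with BigQuery and Airflow"]
negatives = ["Entry level SQL Data Analyst, Microsoft Office and Excel"]

evaluator = TripletEvaluator(anchors=anchors, positives=positives,
                             negatives=negatives, name="ai-job-validation")
print(evaluator(model))  # e.g. {'ai-job-validation_cosine_accuracy': ...}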
Training Details
Training Dataset
Unnamed Dataset
- Size: 809 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 809 samples:
| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | min: 8 tokens, mean: 15.02 tokens, max: 40 tokens | min: 7 tokens, mean: 348.14 tokens, max: 512 tokens |
- Samples:

| sentence_0 | sentence_1 |
|---|---|
| GCP Data Engineer, BigQuery, Airflow DAG, Hadoop ecosystem | requirements for our direct client, please go through the below Job Description. If you are interested please send me your updated word format resume to [email protected] and reach me @ 520-231-4672. Title: GCP Data Engineer Location: Hartford, CT Duration: Full Time 6-8 Years of experience in data extraction and creating data pipeline workflows on Bigdata (Hive, HQL/PySpark) with knowledge of Data Engineering concepts.Experience in analyzing large data sets from multiple data sources, perform validation of data.Knowledge of Hadoop eco-system components like HDFS, Spark, Hive, Sqoop.Experience writing codes in Python.Knowledge of SQL/HQL to write optimized queries.Hands on with GCP Cloud Services such as Big Query, Airflow DAG, Dataflow, Beam etc. |
| Data analysis for legal documents, meticulous data entry, active Top-Secret security clearance | Requirements NOTE: Applicants with an Active TS Clearance preferred Requirements * High School diploma or GED, Undergraduate degree preferred Ability to grasp and understand the organization and functions of the customer Meticulous data entry skills Excellent communication skills; oral and written Competence to review, interpret, and evaluate complex legal and non-legal documents Attention to detail and the ability to read and follow directions is extremely important Strong organizational and prioritization skills Experience with the Microsoft Office suite of applications (Excel, PowerPoint, Word) and other common software applications, to include databases, intermediate skills preferred Proven commitment and competence to provide excellent customer service; positive and flexible Ability to work in a team environment and maintain a professional dispositionThis position requires U.S. Citizenship and a 7 (or 10) year minimum background investigation ** NOTE: The 20% pay differential is d... |
| Trust & Safety, Generative AI, Recommender Systems | experiences achieve more in their careers. Our vision is to create economic opportunity for every member of the global workforce. Every day our members use our products to make connections, discover opportunities, build skills and gain insights. We believe amazing things happen when we work together in an environment where everyone feels a true sense of belonging, and that what matters most in a candidate is having the skills needed to succeed. It inspires us to invest in our talent and support career growth. Join us to challenge yourself with work that matters. Location: At LinkedIn, we trust each other to do our best work where it works best for us and our teams. This role offers a hybrid work option, meaning you can work from home and commute to a LinkedIn office, depending on what’s best for you and when it is important for your team to be together. This role is based in Sunnyvale, CA. Team Information: The mission of the Anti-Abuse AI team is to build trust in every inte... |

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- num_train_epochs: 2
- multi_dataset_batch_sampler: round_robin
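Putting the dataset, loss, and non-default hyperparameters above together, training could be reproduced along these lines. This is a sketch with placeholder pairs; the actual 809 (sentence_0, sentence_1) rows are not reproduced here.
from datasets import Dataset
from sentence_transformers import (SentenceTransformer, SentenceTransformerTrainer,
                                   SentenceTransformerTrainingArguments, losses)

model = SentenceTransformer("sentence-transformers/all-distilroberta-v1")

# Placeholder training pairs; the real dataset has 809 (sentence_0, sentence_1) rows.
train_dataset = Dataset.from_dict({
    "sentence_0": ["GCP Data Engineer, BigQuery, Airflow DAG",
                   "Trust & Safety, Generative AI, Recommender Systems"],
    "sentence_1": ["6-8 years of experience building data pipelines on GCP",
                   "The mission of the Anti-Abuse AI team is to build trust"],
})

# Defaults match the card: scale=20.0, similarity_fct=cos_sim.
loss = losses.MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="distilroberta-ai-job-embeddings",
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    num_train_epochs=2,
    multi_dataset_batch_sampler="round_robin",
)
trainer = SentenceTransformerTrainer(model=model, args=args,
                                     train_dataset=train_dataset, loss=loss)
trainer.train()
Note that MultipleNegativesRankingLoss treats the other in-batch sentence_1 entries as negatives, so a per-device batch size of 2 yields one in-batch negative per anchor.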
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss | ai-job-validation_cosine_accuracy | ai-job-test_cosine_accuracy |
---|---|---|---|---|
-1 | -1 | - | 0.8812 | - |
1.0 | 405 | - | 0.9901 | - |
1.2346 | 500 | 0.07 | - | - |
2.0 | 810 | - | 0.9901 | - |
-1 | -1 | - | 0.9901 | 1.0 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.4.1
- Transformers: 4.48.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}