Model Behavior Architect, Alignment Finetuning (Hiring Immediately) Job at Anthropic, San Francisco, CA

  • Anthropic
  • San Francisco, CA

Job Description

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role

As a Model Behavior Architect at Anthropic, you'll be at the forefront of shaping AI system behavior to ensure it aligns with human values. Working within the Alignment Finetuning team, you'll combine your expertise in model evaluation, prompt engineering, and ethical judgment to help create AI systems that respond with good judgment across diverse scenarios.

Responsibilities

  • Design and implement subtle prompting strategies and data generation pipelines that improve model responses
  • Identify and fix edge-case behaviors through rigorous testing of your data generation pipelines
  • Interact with models to carefully identify where model behavior and judgment can be improved
  • Gather internal and external feedback on model behavior to document areas for improvement
  • Develop evaluations of language model behaviors across judgment-based domains like honesty, character, and ethics
  • Work collaboratively with researchers on related teams like Trust and Safety, Alignment Science, and Applied Finetuning

You May Be a Good Fit If You

  • Have extensive experience with prompt engineering and chaining for language models
  • Demonstrate strong skills in evaluating AI system outputs on subtle or fuzzy tasks
  • Have a background in philosophy, psychology, data science, or related fields
  • Care about AI safety and the ethical implications of both current and future AI behaviors
  • Are comfortable using basic Python and running basic scripts
  • Have a keen eye for identifying subtle issues in AI outputs
  • Understand how LLMs are trained and are familiar with concepts in reinforcement learning
  • Have experience finetuning large language models
  • Are happy to engage in test-driven development and to carefully analyze data and data pipelines

Strong Candidates May Also Have

  • Formal training in ethics, moral philosophy, or moral psychology
  • Experience in data science with emphasis on data verification
  • Conceptual understanding of language model training and finetuning techniques
  • Previous experience developing evaluation frameworks for large language models
  • Background in AI safety research or similar fields
  • Experience with RLHF, constitutional AI, or other alignment techniques
  • Published work related to AI ethics or safety
  • Knowledge of model behavior benchmarking

Additional Information

Join us in our mission to ensure advanced AI systems behave reliably and ethically while staying aligned with human values.

Salary Range: $280,000 - $425,000 USD

Logistics

Education requirements: Bachelor’s degree in a related field or equivalent experience.

Location policy: Hybrid, with at least 25% in-office presence. Some roles may require more.

Visa sponsorship: Available, with efforts made to assist in visa acquisition upon offer.

We encourage you to apply even if you do not meet every qualification. We value diversity and inclusion, and we believe diverse perspectives enhance our work.

Why Join Us?

We believe the most impactful AI research is collaborative and large-scale. We treat AI research as an empirical science, akin to physics or biology, and we foster open communication.

Our research continues directions our team worked on before Anthropic, including GPT-3, interpretability, multimodal neurons, scaling laws, and AI safety.

Join Us!

Anthropic offers competitive compensation, benefits, equity options, generous leave, flexible hours, and a collaborative office environment.

Job Tags

Full time, Immediate start, Visa sponsorship, Flexible hours
