Model Behavior Architect, Alignment Finetuning (Hiring Immediately) Job at Anthropic, San Francisco, CA

  • Anthropic
  • San Francisco, CA

Job Description

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role

As a Model Behavior Architect at Anthropic, you'll be at the forefront of shaping AI system behavior to ensure it aligns with human values. Working within the Alignment Finetuning team, you'll combine your expertise in model evaluation, prompt engineering, and ethical judgment to help create AI systems that respond with good judgment across diverse scenarios.

Responsibilities

  • Design and implement subtle prompting strategies and data generation pipelines that improve model responses
  • Identify and fix edge case behaviors through rigorous testing of your data generation pipelines
  • Interact with models to carefully identify where model behavior and judgment can be improved
  • Gather internal and external feedback on model behavior to document areas for improvement
  • Develop evaluations of language model behaviors across judgment-based domains like honesty, character, and ethics
  • Work collaboratively with researchers on related teams like Trust and Safety, Alignment Science, and Applied Finetuning

You May Be a Good Fit If You

  • Have extensive experience with prompt engineering and chaining for language models
  • Demonstrate strong skills in evaluating AI system outputs on subtle or fuzzy tasks
  • Have a background in philosophy, psychology, data science, or related fields
  • Care about AI safety and the ethical implications of both current and future AI behaviors
  • Are comfortable using basic Python and running basic scripts
  • Have a keen eye for identifying subtle issues in AI outputs
  • Understand how LLMs are trained and are familiar with concepts in reinforcement learning
  • Have experience finetuning large language models
  • Are happy to engage in test-driven development and to carefully analyze data and data pipelines

Strong Candidates May Also Have

  • Formal training in ethics, moral philosophy, or moral psychology
  • Experience in data science with emphasis on data verification
  • Conceptual understanding of language model training and finetuning techniques
  • Previous experience developing evaluation frameworks for large language models
  • Background in AI safety research or similar fields
  • Experience with RLHF, constitutional AI, or other alignment techniques
  • Published work related to AI ethics or safety
  • Knowledge of model behavior benchmarking

Additional Information

Join us in our mission to ensure advanced AI systems behave reliably and ethically while staying aligned with human values.

Salary Range: $280,000 - $425,000 USD

Logistics

Education requirements: Bachelor’s degree in a related field or equivalent experience.

Location policy: Hybrid, with at least 25% in-office presence. Some roles may require more.

Visa sponsorship: Available, with efforts made to assist in visa acquisition upon offer.

We encourage you to apply even if you do not meet every qualification. We value diversity and inclusion, and we believe diverse perspectives enhance our work.

Why Join Us?

We believe the most impactful AI research is collaborative and large-scale. We treat AI research as an empirical science, akin to physics or biology, and we value open communication.

Our research continues directions that team members worked on before joining Anthropic, including GPT-3, interpretability, multimodal neurons, scaling laws, and AI safety.

Join Us!

Anthropic offers competitive compensation, benefits, equity options, generous leave, flexible hours, and a collaborative office environment.

Job Tags

Full time, Immediate start, Visa sponsorship, Flexible hours
