Classification and Subject Access, Information Organization

AI/IA

Adding more labeling guidance, few-shot examples, and an output format:

Here is a set of tweets about a clothing company. Tag each one by sentiment on a scale of 1-5. 1 is negative, 3 is neutral, 5 is positive.

# Coding Guidelines

## 1. Very Negative (1)
- Strong negative emotions (e.g., anger, frustration, sadness)
- Harsh criticism or complaints about UrbanThreads
- Use of negative words (e.g., "hate," "terrible," "awful")
- Negative experiences with UrbanThreads products or services

Example: "I absolutely hate UrbanThreads! Their clothes fell apart after one wash. Worst quality ever, never shopping there again! #UrbanThreadsFail"

... {and so on}


# Format

- [{score}] - {tweet}

# Tweets to Annotate

{tweets}
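
One way to use a prompt like this from code is to fill the `{tweets}` slot and then read the scores back out of the `- [{score}] - {tweet}` lines. The Python below is a minimal sketch under stated assumptions, not a definitive implementation: `PROMPT_TEMPLATE`, `build_prompt`, and `parse_response` are hypothetical names, the template is an abbreviated copy of the prompt (the coding guidelines are left out), and no model is called.

```python
import re

# Abbreviated copy of the prompt above; the coding guidelines are omitted here.
PROMPT_TEMPLATE = """Here is a set of tweets about a clothing company. Tag each one by sentiment on a scale of 1-5. 1 is negative, 3 is neutral, 5 is positive.

# Format

- [{score}] - {tweet}

# Tweets to Annotate

{tweets}
"""

# Matches one output line in the requested format, e.g. "- [4] - Nice fit!"
LINE_RE = re.compile(r"^\s*-\s*\[([1-5])\]\s*-\s*(.+?)\s*$")


def build_prompt(tweets):
    """Fill the {tweets} slot with one tweet per line.

    str.replace is used instead of str.format so the literal
    {score} and {tweet} placeholders in the Format section survive.
    """
    block = "\n".join(f"- {t}" for t in tweets)
    return PROMPT_TEMPLATE.replace("{tweets}", block)


def parse_response(text):
    """Return (score, tweet) pairs; lines that do not fit the format are skipped."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            results.append((int(match.group(1)), match.group(2)))
    return results


if __name__ == "__main__":
    prompt = build_prompt(["Love my new jacket!", "Meh."])
    print(prompt)

    # Made-up model response, for illustration only.
    fake_response = "- [5] - Love my new jacket!\n- [3] - Meh."
    print(parse_response(fake_response))
```

Asking for a fixed line format is what makes the parsing step this small. Skipping lines that do not match is a design choice: it tolerates chatty preambles from the model, but it also means dropped tweets go unnoticed unless you compare the number of parsed rows against the number of tweets sent.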

Today

Skill Check

Key Concepts

Structured vs Unstructured Data

Supervised vs Unsupervised Learning

Classification

Information Extraction

Some Use Cases

Classification

Some Use Cases

Information Extraction

The falling burden of training data

Ad-hoc classification and few-shot expansion

Results

Baseline

LLM Prompted (c.2022)

Results (w/GPT-4 & Fine-Tuning)

Baseline

LLM Prompted (adding GPT-4)

LLM Fine-Tuned

How to do classification and information extraction

Easiest Way: Just Ask!

Questions to consider:

Activity: Basic Sentiment Classification

Example: Tweets

How to use it: Better

Temperature and Stochasticity

Structured Data Outputs

How to use it: Better-er

How to use it: Best

Fine-tuning!

Coda: How do you get the data out?

Coda: Data Analysis

Summary

Lab: Classification Prompt Battle