The Auto Classes in the Hugging Face Transformers Library


yuuuun 2025. 3. 13. 03:44

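Hugging Face's transformers library provides Auto classes so that a wide range of pretrained models can be loaded and used through one interface.
Classes such as AutoConfig, AutoTokenizer, AutoModel, and AutoModelForCausalLM pick the right concrete class for a given checkpoint name and load it.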

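Class	Role
AutoConfig	Loads the model's configuration and hyperparameters
AutoTokenizer	Loads the tokenizer that matches the model
AutoModel	Loads the base Transformer model (no task-specific head)
AutoModelForCausalLM	Loads a text-generation model (GPT family)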

With these Auto classes, code is not tied to one model, so you can swap in a different checkpoint and experiment with little effort. 🚀
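To make the "automatic" part less magical, here is a minimal pure-Python sketch of the dispatch idea: read a `model_type` string and pick the matching class from a lookup table. This is a toy under stated assumptions, not the library's actual implementation. The names `BertConfigSketch`, `GPT2ConfigSketch`, `CONFIG_REGISTRY`, and `auto_config_sketch` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BertConfigSketch:
    # Hypothetical stand-in for a BERT-style config
    hidden_size: int = 768
    num_hidden_layers: int = 12


@dataclass
class GPT2ConfigSketch:
    # Hypothetical stand-in for a GPT-2-style config
    n_embd: int = 768
    n_layer: int = 12


# Hypothetical registry: "model_type" string -> concrete config class
CONFIG_REGISTRY = {"bert": BertConfigSketch, "gpt2": GPT2ConfigSketch}


def auto_config_sketch(raw: dict):
    """Pick the concrete class from the 'model_type' field, then build it."""
    fields = dict(raw)  # copy so the caller's dict is not modified
    model_type = fields.pop("model_type")
    if model_type not in CONFIG_REGISTRY:
        raise ValueError(f"Unknown model_type: {model_type!r}")
    return CONFIG_REGISTRY[model_type](**fields)


bert = auto_config_sketch({"model_type": "bert", "num_hidden_layers": 6})
gpt2 = auto_config_sketch({"model_type": "gpt2"})
print(type(bert).__name__, bert.num_hidden_layers)  # BertConfigSketch 6
print(type(gpt2).__name__, gpt2.n_layer)            # GPT2ConfigSketch 12
```

The calling code never names a concrete class, which is why switching checkpoints usually only means changing one string.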

1. AutoConfig

AutoConfig is the class that automatically loads a model's configuration (its hyperparameters).

🔹 Purpose

Defines the model's architecture and hyperparameters
Lets you override specific settings when loading a model

🔹 Example code

from transformers import AutoConfig

# Load the configuration of the BERT model
config = AutoConfig.from_pretrained("bert-base-uncased")
print(config)  # Print the model configuration

🔹 Output

BertConfig {
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.49.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

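🔹 Practical uses

Training a model from scratch (building the architecture from a config, with randomly initialized weights)
Customizing the model configuration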
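Both uses can be sketched together. The example below assumes PyTorch is installed and builds a deliberately small BERT config with `AutoConfig.for_model`, which needs no download, then creates an untrained model from it with `AutoModel.from_config`. The sizes are arbitrary illustration values, not recommendations.

```python
from transformers import AutoConfig, AutoModel

# Build a BERT config locally with custom (hypothetical) sizes.
# To start from a published checkpoint instead, overrides can be passed as
# keyword arguments, e.g.
#   AutoConfig.from_pretrained("bert-base-uncased", num_hidden_layers=6)
config = AutoConfig.for_model(
    "bert",
    num_hidden_layers=2,
    hidden_size=128,
    num_attention_heads=2,
    intermediate_size=256,
)

# from_config builds the architecture with randomly initialized weights;
# no pretrained weights are loaded, so this is the "from scratch" case.
model = AutoModel.from_config(config)

print(type(model).__name__)      # BertModel
print(config.num_hidden_layers)  # 2
```

Note the difference from `from_pretrained`: a model created with `from_config` has the requested shape but has learned nothing yet.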

2. AutoTokenizer

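AutoTokenizer automatically loads the tokenizer that matches the given model.

🔹 Purpose

Splits text into tokens (e.g. BERT uses WordPiece, GPT-2 uses byte-level Byte-Pair Encoding)
tokenizer.encode() converts text to token IDs, and tokenizer.decode() converts token IDs back to text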

🔹 Example code

>>> from transformers import AutoTokenizer
>>> # Load the tokenizer that matches the GPT-2 model
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> text = "Hello, how are you?"
>>> tokens = tokenizer(text, return_tensors="pt")  # Return PyTorch tensors
>>> print(tokens)
{'input_ids': tensor([[15496,    11,   703,   389,   345,    30]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1]])}

🔹 Practical uses

Preprocessing input for NLP models
Converting text to tokens during training and inference
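To illustrate what "text to token IDs and back" means, here is a toy WordPiece-style tokenizer in pure Python. It is only a sketch: the vocabulary `VOCAB` is made up, and the functions `wordpiece`, `encode`, and `decode` are hypothetical helpers, not the library's API. Real tokenizers learn their vocabulary from data and also handle punctuation, casing rules, and special tokens.

```python
# Made-up vocabulary; "##" marks a piece that continues the previous one.
VOCAB = ["[UNK]", "hello", "how", "are", "you", "token", "##izer", "##s"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def wordpiece(word):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation piece
            if candidate in TOKEN_TO_ID:
                piece = candidate
                break
            end -= 1  # try a shorter candidate
        if piece is None:
            return ["[UNK]"]  # no piece matched: whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def encode(text):
    """Text -> list of token IDs (whitespace split, lowercased)."""
    ids = []
    for word in text.lower().split():
        ids.extend(TOKEN_TO_ID[p] for p in wordpiece(word))
    return ids


def decode(ids):
    """List of token IDs -> text, gluing '##' pieces onto the previous piece."""
    text = ""
    for i in ids:
        tok = VOCAB[i]
        if tok.startswith("##"):
            text += tok[2:]
        else:
            text += (" " if text else "") + tok
    return text


ids = encode("hello tokenizers")
print(ids)          # [1, 5, 6, 7]
print(decode(ids))  # hello tokenizers
```

The word "tokenizers" is not in the vocabulary as a whole, so it is split into `token`, `##izer`, and `##s`. This is how subword tokenizers cover words they have never seen as a unit.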

3. AutoModel

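AutoModel automatically loads the base model architecture that matches a given checkpoint.

🔹 Purpose

Loads many different models (BERT, RoBERTa, T5, GPT, and others) through one interface
AutoModel returns a PyTorch model; in releases that include TensorFlow support, the TensorFlow counterpart is TFAutoModel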

🔹 Example code

>>> from transformers import AutoModel
>>> model = AutoModel.from_pretrained("gpt2")
>>> print(model)
GPT2Model(
  (wte): Embedding(50257, 768)
  (wpe): Embedding(1024, 768)
  (drop): Dropout(p=0.1, inplace=False)
  (h): ModuleList(
    (0-11): 12 x GPT2Block(
      (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (attn): GPT2Attention(
        (c_attn): Conv1D(nf=2304, nx=768)
        (c_proj): Conv1D(nf=768, nx=768)
        (attn_dropout): Dropout(p=0.1, inplace=False)
        (resid_dropout): Dropout(p=0.1, inplace=False)
      )
      (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (mlp): GPT2MLP(
        (c_fc): Conv1D(nf=3072, nx=768)
        (c_proj): Conv1D(nf=768, nx=3072)
        (act): NewGELUActivation()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)

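🔹 Practical uses

Loading a pretrained model for further training (fine-tuning)
Serving as the backbone for NLP tasks such as sentence embedding, text classification, and named entity recognition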
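The base model returned by AutoModel outputs hidden states rather than task predictions. The sketch below shows the shape of that output and a simple mean-pooled sentence embedding. To avoid a download it uses a tiny, randomly initialized GPT-2 built from a config (the sizes are arbitrary assumptions) and random token IDs, so the values themselves are meaningless; only the shapes are the point.

```python
import torch
from transformers import AutoConfig, AutoModel

# Tiny GPT-2 config with hypothetical sizes; weights are random, not pretrained.
config = AutoConfig.for_model("gpt2", n_layer=2, n_head=2, n_embd=64, vocab_size=1000)
model = AutoModel.from_config(config)
model.eval()  # disable dropout for inference

# One "sentence" of 6 random token IDs (stand-in for real tokenizer output).
input_ids = torch.randint(0, config.vocab_size, (1, 6))

with torch.no_grad():
    outputs = model(input_ids=input_ids)

# One hidden vector per token: (batch, sequence length, hidden size)
print(outputs.last_hidden_state.shape)  # torch.Size([1, 6, 64])

# Mean pooling over the token axis gives one vector per sentence.
sentence_embedding = outputs.last_hidden_state.mean(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 64])
```

With a real checkpoint you would load the model with `AutoModel.from_pretrained(...)` and feed it the output of the matching tokenizer instead of random IDs.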

4. AutoModelForCausalLM

AutoModelForCausalLM automatically loads a text-generation (causal language modeling) model.

🔹 Purpose

Used to load GPT-style text-generation models such as GPT-2 and LLaMA
These models predict the next token from the tokens that precede it

🔹 Example code

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the GPT-2 model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Input prompt
input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate text by repeatedly predicting the next token.
# max_length counts the prompt tokens too; GPT-2 has no pad token, so reuse EOS.
output = model.generate(**inputs, max_length=50, pad_token_id=tokenizer.eos_token_id)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print('output: ', output)
print('generated text: ', generated_text)

🔹 Output

output:  tensor([[7454, 2402,  257,  640,   11,  262,  995,  373,  257, 1295,  286, 1049,
         8737,  290, 1049, 3514,   13,  383,  995,  373,  257, 1295,  286, 1049,
         3514,   11,  290,  262,  995,  373,  257, 1295,  286, 1049, 3514,   13,
          383,  995,  373,  257, 1295,  286, 1049, 3514,   11,  290,  262,  995,
          373,  257]])
generated text:  Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a

🔹 Practical uses

Chatbots, story generation, autocompletion
Applications built on GPT-style generative models
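The repeated sentence in the generated text above is typical of greedy decoding, where the single most likely next token is chosen at every step. The toy below sketches that loop in pure Python. It is a simplification under stated assumptions: the score table `NEXT_TOKEN_SCORES` is invented, and it looks only at the last token, whereas a real causal language model conditions on the whole preceding sequence.

```python
# Invented scores: for each token, how likely each candidate next token is.
NEXT_TOKEN_SCORES = {
    "once": {"upon": 0.9, "the": 0.1},
    "upon": {"a": 0.8, "the": 0.2},
    "a": {"time": 0.7, "place": 0.3},
    "time": {"the": 0.6, "a": 0.4},
    "the": {"world": 0.9, "time": 0.1},
    "world": {"was": 1.0},
    "was": {"a": 0.8, "the": 0.2},
    "place": {"the": 1.0},
}


def greedy_generate(prompt, max_length):
    """Append the highest-scoring next token until max_length tokens exist.

    As in the example above, max_length counts the prompt tokens too.
    """
    tokens = list(prompt)
    while len(tokens) < max_length:
        scores = NEXT_TOKEN_SCORES[tokens[-1]]
        tokens.append(max(scores, key=scores.get))  # greedy: take the argmax
    return tokens


result = greedy_generate(["once", "upon", "a", "time"], max_length=12)
print(" ".join(result))
# once upon a time the world was a time the world was
```

Because the choice is deterministic, the loop falls into a cycle as soon as it revisits a state. In transformers, passing `do_sample=True` (optionally with `top_k` or `temperature`) or `no_repeat_ngram_size` to `generate()` is a common way to reduce this kind of repetition.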

5. Other AutoModel classes

Class	Purpose	Example models
AutoModelForSequenceClassification	Sentence classification	BERT, RoBERTa, ALBERT
AutoModelForTokenClassification	Named entity recognition (NER)	BERT, XLM-RoBERTa
AutoModelForQuestionAnswering	Question answering	BERT, T5
AutoModelForSeq2SeqLM	Translation, summarization	T5, BART, mBART
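These task-specific classes add a head on top of the base model. As a rough sketch, the example below builds a small, randomly initialized BERT classifier with three labels and checks that the output is one score per label for each input. The config sizes and the label count are arbitrary assumptions chosen so that nothing has to be downloaded.

```python
import torch
from transformers import AutoConfig, AutoModelForSequenceClassification

# Tiny BERT config with hypothetical sizes and a 3-class classification head.
config = AutoConfig.for_model(
    "bert",
    num_hidden_layers=2,
    hidden_size=64,
    num_attention_heads=2,
    intermediate_size=128,
    num_labels=3,
)
model = AutoModelForSequenceClassification.from_config(config)
model.eval()  # disable dropout for inference

# A batch of 2 sequences with 8 random token IDs each.
input_ids = torch.randint(0, config.vocab_size, (2, 8))

with torch.no_grad():
    logits = model(input_ids=input_ids).logits

print(type(model).__name__)  # BertForSequenceClassification
print(logits.shape)          # torch.Size([2, 3])
```

Compare this with AutoModel, which would return per-token hidden states for the same input rather than class scores.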
