What 🤗 Transformers can do

You are viewing v4.34.0 version. A newer version v4.47.1 is available.

🤗 Transformers is a library of pretrained state-of-the-art models for natural language processing (NLP), computer vision, and audio and speech processing tasks. The library contains not only Transformer models but also non-Transformer models, such as modern convolutional networks for computer vision tasks.

If you look at some of today's most popular consumer products, like smartphones, apps, and televisions, odds are that some kind of deep learning technology is behind them. Want to remove a background object from a picture taken with your smartphone? This is an example of a panoptic segmentation task (don't worry if you don't know what this means yet, it is described in the following sections!).

This page provides simple examples of how the 🤗 Transformers library handles a range of speech and audio, computer vision, and NLP tasks, each in just three lines of code.

Audio

Audio and speech processing tasks differ a little from the other modalities, mainly because audio as an input is a continuous signal. Unlike text, a raw audio waveform can't be neatly split into discrete chunks the way a sentence can be divided into words. To get around this, the raw audio signal is sampled at regular intervals. Taking more samples within a given interval means a higher sampling rate, and the sampled audio more closely resembles the original audio source.
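
To make the idea of a sampling rate concrete, here is a minimal sketch in plain Python. The 440 Hz tone, the two sampling rates, and the `sample_sine` helper are illustrative assumptions, not part of the library.

```python
import math


def sample_sine(frequency_hz, sampling_rate_hz, duration_s):
    """Sample a sine wave at evenly spaced time steps."""
    num_samples = round(sampling_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(num_samples)
    ]


# The same 10 ms of a 440 Hz tone at two (illustrative) sampling rates.
low_rate = sample_sine(440, 8_000, 0.01)
high_rate = sample_sine(440, 16_000, 0.01)

# Doubling the sampling rate doubles the number of values describing the same signal.
print(len(low_rate), len(high_rate))
```

The higher-rate list describes the same stretch of audio with twice as many values, which is why it follows the continuous waveform more closely.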

Previous approaches preprocessed the audio to extract useful features from it. It is now more common to feed the raw audio waveform directly to a feature encoder to extract an audio representation. This simplifies the preprocessing step and lets the model learn the most essential features.

Audio classification

Audio classification is a task that labels audio data from a predefined set of classes. It is a broad category with many specific applications.

Some examples include:

  • acoustic scene classification: label audio with a scene label ("office", "beach", "stadium").
  • acoustic event detection: label audio with a sound event label ("car horn", "whale calling", "glass breaking").
  • tagging: label audio containing multiple sounds (birdsong, speaker identification in a meeting).
  • music classification: label music with a genre label ("metal", "hip-hop", "country").

>>> from transformers import pipeline

>>> classifier = pipeline(task="audio-classification", model="superb/hubert-base-superb-er")
>>> preds = classifier("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> preds
[{'score': 0.4532, 'label': 'hap'},
 {'score': 0.3622, 'label': 'sad'},
 {'score': 0.0943, 'label': 'neu'},
 {'score': 0.0903, 'label': 'ang'}]

Automatic speech recognition

Automatic speech recognition (ASR) transcribes speech into text. Because speech is such a natural form of human communication, ASR is one of the most common audio tasks. Today, ASR systems are embedded in "smart" technology products like speakers, phones, and cars. We can ask our virtual assistants to play music, set reminders, and tell us the weather.

But one of the key challenges Transformer architectures have helped with is low-resource languages. After pretraining on large amounts of speech data, finetuning the model on only one hour of labeled speech data in a low-resource language can yield higher-quality results than previous ASR systems trained on 100 times more labeled data.

>>> from transformers import pipeline

>>> transcriber = pipeline(task="automatic-speech-recognition", model="openai/whisper-small")
>>> transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}

Computer vision

One of the earliest successful computer vision tasks was recognizing images of zip code digits using a convolutional neural network (CNN). An image is composed of pixels, and each pixel has a numerical value. This makes it easy to represent an image as a matrix of pixel values. Each particular combination of pixel values describes the colors of an image.
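
As a small illustration of an image as a matrix of pixel values, the sketch below builds a tiny image by hand. The 8-bit value range (0 to 255) and the red, green, blue channel order are assumptions made for the example.

```python
# A tiny 2x3 grayscale "image": one intensity value (0-255) per pixel.
grayscale = [
    [0, 128, 255],
    [64, 192, 32],
]

# The same idea in color: each pixel holds a (red, green, blue) triple.
rgb = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(255, 255, 0), (0, 0, 0), (255, 255, 255)],
]

height, width = len(rgb), len(rgb[0])
print(f"{height}x{width} image, top-left pixel = {rgb[0][0]}")
```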

Computer vision tasks are generally approached in one of two ways:

  1. Use convolutions to learn the hierarchical features of an image, from low-level features up to high-level abstract concepts.

  2. Split an image into patches and use a Transformer to gradually learn how the image patches relate to each other to form an image. Unlike the bottom-up approach favored by a CNN, this is a bit like starting with a blurry sketch and gradually bringing it into focus.
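
The patch-splitting step of the second approach can be sketched in a few lines. This is only an illustration on a nested list; `split_into_patches` is a hypothetical helper, and real models work on tensors and embed each patch afterwards.

```python
def split_into_patches(image, patch_size):
    """Split a 2D grid (a list of rows) into non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# A 4x4 "image" whose pixel values are simply 0..15, read row by row.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch_size=2)

print(len(patches))  # number of 2x2 patches
print(patches[0])    # the top-left patch
```

Each patch then plays a role similar to a token in a sentence: the Transformer learns how the patches relate to one another.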

Image classification

Image classification labels an entire image from a predefined set of classes.

Like most classification tasks, image classification has many practical use cases. Some examples include:

  • healthcare: label medical images to detect disease or monitor patient health.
  • environment: label satellite images to monitor deforestation, inform wildland management, or detect wildfires.
  • agriculture: label images of crops to monitor plant health, or satellite images to monitor land use.
  • ecology: label images of animal or plant species to survey wildlife populations or track endangered species.

>>> from transformers import pipeline

>>> classifier = pipeline(task="image-classification")
>>> preds = classifier(
...     "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> print(*preds, sep="\n")
{'score': 0.4335, 'label': 'lynx, catamount'}
{'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}
{'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}
{'score': 0.0239, 'label': 'Egyptian cat'}
{'score': 0.0229, 'label': 'tiger cat'}

Object detection

Unlike image classification, object detection identifies multiple objects within an image and each object's position, defined by a bounding box.

Some example applications of object detection include:

  • self-driving vehicles: detect everyday traffic objects such as other vehicles, pedestrians, and traffic lights.
  • remote sensing: disaster monitoring, urban planning, and weather forecasting.
  • defect detection: detect cracks or structural damage in buildings, and manufacturing defects.

>>> from transformers import pipeline

>>> detector = pipeline(task="object-detection")
>>> preds = detector(
...     "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
>>> preds
[{'score': 0.9865,
  'label': 'cat',
  'box': {'xmin': 178, 'ymin': 154, 'xmax': 882, 'ymax': 598}}]
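
The `box` entry in the output above can be turned into a size with simple arithmetic. The sketch below assumes the four values are pixel coordinates of the box corners; `box_size` is a hypothetical helper, not a library function.

```python
def box_size(box):
    """Return (width, height, area) in pixels for an xmin/ymin/xmax/ymax box."""
    width = box["xmax"] - box["xmin"]
    height = box["ymax"] - box["ymin"]
    return width, height, width * height


# The box copied from the prediction shown above.
box = {"xmin": 178, "ymin": 154, "xmax": 882, "ymax": 598}
width, height, area = box_size(box)
print(width, height, area)
```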

Image segmentation

Image segmentation is a pixel-level task that assigns every pixel in an image to a class. It differs from object detection, which uses bounding boxes to label and predict objects in an image, because segmentation is more granular: it can detect objects at the pixel level.

There are several types of image segmentation:

  • instance segmentation: in addition to labeling the class of an object, it also labels each distinct instance of an object ("dog-1", "dog-2", and so on).
  • panoptic segmentation: a combination of semantic and instance segmentation. It labels each pixel with a semantic class and also with the distinct object instance it belongs to.
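
The difference between these label types is easier to see on a toy example. The sketch below uses a hand-made 2x4 "image" with two dogs on grass; the class names and the convention that instance id 0 means "no countable object" are assumptions for illustration.

```python
# Semantic map: one class name per pixel.
semantic = [
    ["dog", "dog", "grass", "dog"],
    ["dog", "grass", "grass", "dog"],
]

# Instance map: which object each pixel belongs to (0 = no countable object).
instance = [
    [1, 1, 0, 2],
    [1, 0, 0, 2],
]

# Panoptic labels pair the semantic class with the instance id for every pixel.
panoptic = [
    [(cls, inst) for cls, inst in zip(sem_row, inst_row)]
    for sem_row, inst_row in zip(semantic, instance)
]

segments = sorted({label for row in panoptic for label in row})
print(segments)
```

The semantic map alone cannot tell the two dogs apart, while the panoptic labels keep both the class and the instance for every pixel.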

Segmentation tasks are helpful for self-driving vehicles, which can build a pixel-level map of the world around them to navigate safely around pedestrians and other vehicles. Segmentation is also useful in medical imaging, where detecting objects at the pixel level can help identify abnormal cells or organ features. Image segmentation can also be used in ecommerce, for example to try on clothes virtually or to create augmented reality experiences by overlaying virtual objects on the real world through a camera.

>>> from transformers import pipeline

>>> segmenter = pipeline(task="image-segmentation")
>>> preds = segmenter(
...     "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> print(*preds, sep="\n")
{'score': 0.9879, 'label': 'LABEL_184'}
{'score': 0.9973, 'label': 'snow'}
{'score': 0.9972, 'label': 'cat'}

Depth estimation

Depth estimation predicts the distance of each pixel in an image from the camera. This computer vision task is especially important for scene understanding and reconstruction. For example, a self-driving vehicle needs to understand how far away objects like pedestrians, traffic signs, and other vehicles are in order to avoid obstacles and collisions. Depth information also helps construct 3D representations from 2D images and can be used to create high-quality 3D representations of biological structures or buildings.

There are two approaches to depth estimation:

  • stereo: depth is estimated by comparing two images of the same scene taken from slightly different angles.
  • monocular: depth is estimated from a single image.

>>> from transformers import pipeline

>>> depth_estimator = pipeline(task="depth-estimation")
>>> preds = depth_estimator(
...     "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
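
For the stereo approach described above, the textbook relation for a rectified camera pair is depth = focal length × baseline / disparity. This relation is not stated on this page, and the sketch below is not how the pipeline above works (it estimates depth from a single image). The focal length, baseline, and disparity values are made up for illustration.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters for a rectified stereo pair (standard pinhole-camera relation)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 700 px focal length, cameras 12 cm apart.
near = depth_from_disparity(700.0, 0.12, 42.0)  # large shift between the two views
far = depth_from_disparity(700.0, 0.12, 7.0)    # small shift between the two views
print(near, far)
```

A point that shifts a lot between the two views is close to the cameras, and a point that barely shifts is far away.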

Natural language processing

NLP tasks are among the most common types of tasks because text is such a natural way for us to communicate. To get text into a format a model recognizes, it needs to be tokenized. This means dividing a sequence of text into separate words or subwords (tokens) and then converting these tokens into numbers. As a result, a sequence of text can be represented as a sequence of numbers, and that sequence of numbers can be fed into a model to solve all sorts of NLP tasks!
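
The two steps described above (split into tokens, then map tokens to numbers) can be sketched with a toy whitespace tokenizer. `build_vocab`, `encode`, and the `[UNK]` placeholder are assumptions for this example; real tokenizers in the library typically use learned subword vocabularies rather than whole words.

```python
def build_vocab(texts):
    """Assign an integer id to every distinct lowercase word, in order of first appearance."""
    vocab = {"[UNK]": 0}  # id 0 is reserved for words not seen when building the vocabulary
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab):
    """Turn a text into a list of token ids, using the unknown id for unseen words."""
    return [vocab.get(word, vocab["[UNK]"]) for word in text.lower().split()]


vocab = build_vocab(["hugging face is a company", "a company is based in new york"])
ids = encode("Hugging Face is based in Paris", vocab)
print(ids)
```

The word "Paris" never appeared when the vocabulary was built, so it maps to the unknown id. Subword tokenization exists largely to reduce this problem.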

Text classification

Like classification tasks in any other modality, text classification labels a sequence of text (a sentence, a paragraph, or a document) from a predefined set of classes. Text classification has many practical applications. Some examples include:

  • sentiment analysis: label text according to some polarity, such as positive or negative, which can inform and support decision-making in fields like politics, finance, and marketing.
  • content classification: label text according to its topic (weather, sports, finance, and so on) to help organize and filter information in news and social media feeds.

>>> from transformers import pipeline

>>> classifier = pipeline(task="sentiment-analysis")
>>> preds = classifier("Hugging Face is the best thing since sliced bread!")
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> preds
[{'score': 0.9991, 'label': 'POSITIVE'}]

Token classification

In any NLP task, text is preprocessed by separating it into individual words or subwords. These separated pieces are called tokens. Token classification assigns each token a label from a predefined set of classes.

Two common types of token classification are:

  • named entity recognition (NER): label a token according to an entity category such as organization, person, location, or date. NER is especially popular in biomedical settings such as genomics, where it can label genes, proteins, and drug names.
  • part-of-speech tagging (POS): label a token according to its part of speech, such as noun, verb, or adjective. POS helps translation systems understand how two identical words differ grammatically (for example, "bank" used as a noun versus "bank" used as a verb meaning to deposit money).

>>> from transformers import pipeline

>>> classifier = pipeline(task="ner")
>>> preds = classifier("Hugging Face is a French company based in New York City.")
>>> preds = [
...     {
...         "entity": pred["entity"],
...         "score": round(pred["score"], 4),
...         "index": pred["index"],
...         "word": pred["word"],
...         "start": pred["start"],
...         "end": pred["end"],
...     }
...     for pred in preds
... ]
>>> print(*preds, sep="\n")
{'entity': 'I-ORG', 'score': 0.9968, 'index': 1, 'word': 'Hu', 'start': 0, 'end': 2}
{'entity': 'I-ORG', 'score': 0.9293, 'index': 2, 'word': '##gging', 'start': 2, 'end': 7}
{'entity': 'I-ORG', 'score': 0.9763, 'index': 3, 'word': 'Face', 'start': 8, 'end': 12}
{'entity': 'I-MISC', 'score': 0.9983, 'index': 6, 'word': 'French', 'start': 18, 'end': 24}
{'entity': 'I-LOC', 'score': 0.999, 'index': 10, 'word': 'New', 'start': 42, 'end': 45}
{'entity': 'I-LOC', 'score': 0.9987, 'index': 11, 'word': 'York', 'start': 46, 'end': 50}
{'entity': 'I-LOC', 'score': 0.9992, 'index': 12, 'word': 'City', 'start': 51, 'end': 55}
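
The output above labels individual tokens, including subword pieces such as "##gging". A minimal sketch of merging them back into whole entities is shown below. `group_entities` is a hypothetical helper written for this page's output format (the `##` prefix marks a subword continuation); the pipeline also accepts an `aggregation_strategy` argument that serves a similar purpose.

```python
def group_entities(tokens):
    """Merge consecutive tokens that share an entity type into whole-entity spans."""
    groups = []
    for tok in tokens:
        label = tok["entity"].split("-", 1)[-1]  # "I-ORG" -> "ORG"
        prev = groups[-1] if groups else None
        if prev and prev["label"] == label and tok["index"] == prev["last_index"] + 1:
            if tok["word"].startswith("##"):
                prev["text"] += tok["word"][2:]  # subword piece: attach directly
            else:
                prev["text"] += " " + tok["word"]  # new word: separate with a space
            prev["last_index"] = tok["index"]
        else:
            groups.append({"label": label, "text": tok["word"], "last_index": tok["index"]})
    return [(g["label"], g["text"]) for g in groups]


# The relevant fields copied from the predictions shown above.
tokens = [
    {"entity": "I-ORG", "index": 1, "word": "Hu"},
    {"entity": "I-ORG", "index": 2, "word": "##gging"},
    {"entity": "I-ORG", "index": 3, "word": "Face"},
    {"entity": "I-MISC", "index": 6, "word": "French"},
    {"entity": "I-LOC", "index": 10, "word": "New"},
    {"entity": "I-LOC", "index": 11, "word": "York"},
    {"entity": "I-LOC", "index": 12, "word": "City"},
]
entities = group_entities(tokens)
print(entities)
```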

Question answering

Question answering is another token-level task that returns an answer to a question, sometimes with context (open-domain) and other times without context (closed-domain). This task happens whenever we ask a virtual assistant something like whether a restaurant is open. It can also provide customer or technical support and help search engines retrieve the information you are asking for.

There are two common types of question answering:

  • extractive: given a question and some context, the answer is a span of text that the model extracts from the context.
  • abstractive: given a question and some context, the answer is generated from the context. This approach is handled by the Text2TextGenerationPipeline rather than the QuestionAnsweringPipeline.

>>> from transformers import pipeline

>>> question_answerer = pipeline(task="question-answering")
>>> preds = question_answerer(
...     question="What is the name of the repository?",
...     context="The name of the repository is huggingface/transformers",
... )
>>> print(
...     f"score: {round(preds['score'], 4)}, start: {preds['start']}, end: {preds['end']}, answer: {preds['answer']}"
... )
score: 0.9327, start: 30, end: 54, answer: huggingface/transformers
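
In the extractive setting, the `start` and `end` values in the output above are character offsets into the context. The short check below, using the values printed above, shows that slicing the context with them recovers the answer.

```python
context = "The name of the repository is huggingface/transformers"
start, end = 30, 54  # values reported in the output above

# Slicing the context with the reported offsets yields the extracted answer span.
answer = context[start:end]
print(answer)
```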

Summarization

Summarization creates a shorter version of a long document while preserving as much of the original meaning as possible. Summarization is a sequence-to-sequence task: it outputs a text sequence shorter than the input. Summarization can help readers quickly grasp the main points of long-form documents. Legislative bills, legal and financial documents, patents, and scientific papers are a few examples of documents where summarization can save readers time and serve as a reading aid.

Like question answering, there are two types of summarization:

  • extractive: identify and extract the most important sentences from the original text.
  • abstractive: generate the target summary from the original text. The summary may include new words that are not in the input document. The SummarizationPipeline uses the abstractive approach.

>>> from transformers import pipeline

>>> summarizer = pipeline(task="summarization")
>>> summarizer(
...     "In this work, we presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention. For translation tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers. On both WMT 2014 English-to-German and WMT 2014 English-to-French translation tasks, we achieve a new state of the art. In the former task our best model outperforms even all previously reported ensembles."
... )
[{'summary_text': ' The Transformer is the first sequence transduction model based entirely on attention . It replaces the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention . For translation tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers .'}]
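
The pipeline above is abstractive. For contrast, here is a deliberately naive sketch of the extractive idea: score each sentence by how frequent its words are in the whole text and keep the top ones verbatim. The word-frequency heuristic and the `extractive_summary` helper are assumptions chosen for brevity, not what any library component does.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences=1):
    """Return the highest-scoring sentences, unchanged and in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    word_counts = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # Average corpus frequency of the words in this sentence.
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(word_counts[w] for w in words) / max(len(words), 1)

    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    return [s for s in sentences if s in top]


text = (
    "The Transformer is based entirely on attention. "
    "Attention replaces the recurrent layers. "
    "We thank our reviewers."
)
summary = extractive_summary(text, num_sentences=2)
print(summary)
```

Every sentence in the result is copied from the input, whereas an abstractive model is free to rephrase.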

Translation

Translation converts a sequence of text in one language into another language. It plays an important role in helping people from different backgrounds communicate with each other. It can help content reach a wider audience, and it can even serve as a learning tool for people studying a new language. Like summarization, translation is a sequence-to-sequence task: the model receives an input sequence and returns a target output sequence.

Early translation models were mostly monolingual, but recently there has been growing interest in multilingual models that can translate between many pairs of languages.

>>> from transformers import pipeline

>>> text = "translate English to French: Hugging Face is a community-based open-source platform for machine learning."
>>> translator = pipeline(task="translation", model="t5-small")
>>> translator(text)
[{'translation_text': "Hugging Face est une tribune communautaire de l'apprentissage des machines."}]

Language modeling

Language modeling is a task that predicts a word in a sequence of text. It has become a very popular NLP task because a pretrained language model can be finetuned for many other downstream tasks. Lately, there has been a lot of interest in large language models (LLMs) that are capable of zero-shot or few-shot learning. This means the model can solve tasks it wasn't explicitly trained to do! Language models can be used to generate fluent and convincing text, but you need to be careful because the text may not always be accurate.

There are two types of language modeling:

  • causal language modeling: the model's objective is to predict the next token in a sequence, and future tokens are masked.

    >>> from transformers import pipeline
    
    >>> prompt = "Hugging Face is a community-based open-source platform for machine learning."
    >>> generator = pipeline(task="text-generation")
    >>> generator(prompt)  # doctest: +SKIP

  • masked language modeling: the model's objective is to predict a masked token in a sequence, with full access to all the tokens in the sequence.

    >>> from transformers import pipeline

    >>> text = "Hugging Face is a community-based open-source <mask> for machine learning."
    >>> fill_mask = pipeline(task="fill-mask")
    >>> preds = fill_mask(text, top_k=1)
    >>> preds = [
    ...     {
    ...         "score": round(pred["score"], 4),
    ...         "token": pred["token"],
    ...         "token_str": pred["token_str"],
    ...         "sequence": pred["sequence"],
    ...     }
    ...     for pred in preds
    ... ]
    >>> preds
    [{'score': 0.2236,
      'token': 1761,
      'token_str': ' platform',
      'sequence': 'Hugging Face is a community-based open-source platform for machine learning.'}]
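
The difference between the two objectives comes down to which positions each token is allowed to see. A minimal sketch of the two visibility patterns is shown below; the helper names are hypothetical, and real models apply such masks inside their attention layers.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (only j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def full_mask(seq_len):
    """Masked language modeling: every position may attend to every other position."""
    return [[True] * seq_len for _ in range(seq_len)]


# Print the causal pattern for a 4-token sequence: "x" = visible, "." = hidden.
for row in causal_mask(4):
    print("".join("x" if allowed else "." for allowed in row))
```

In the causal pattern each row reveals one more token than the row above it, so a token never sees what comes after it. In the full pattern every token sees the entire sequence, including the tokens on both sides of a masked position.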

Hopefully, this page has given you more background on the types of tasks in each modality and the practical importance of each one. In the next section, you'll learn how 🤗 Transformers works to solve these tasks.