Supercharged Searching on the 🤗 Hub
The huggingface_hub library is a lightweight interface that provides a programmatic approach to exploring the endpoints Hugging Face hosts: models, datasets, and Spaces.

Up until now, searching the Hub through this interface was tricky to pull off, and there were many aspects of it a user had to "just know" and get accustomed to.

In this article, we will look at a few new features added to huggingface_hub that provide a friendly API for searching for the models and datasets you want to use, without ever leaving Jupyter or your Python interface.
Before we begin, if you do not have the latest version of the huggingface_hub library on your system, run the following cell:
!pip install huggingface_hub -U
Situating the Problem

First, let's imagine the scenario you are in. You would like to find all models hosted on the Hub for text classification that were trained on the GLUE dataset and are compatible with PyTorch.

You could simply open https://huggingface.co/models and use the widgets there. But this requires leaving your IDE and scanning the results, and it takes a few button clicks to get the information you need.

What if there were a way to solve this without leaving your IDE? With a programmatic interface, it could also be easy to fold into workflows for exploring the Hub.
This is where huggingface_hub comes in.

If you are already familiar with the library, you may know that models of this kind can already be searched for. However, getting the query right is a process of trial and error.

Could that be simplified? Let's find out!
Finding what you need

First we will import HfApi, a class that helps us interact with Hugging Face's backend hosting for models, datasets, and more. Alongside it, we will import a few helper classes: ModelFilter and ModelSearchArguments.
from huggingface_hub import HfApi, ModelFilter, ModelSearchArguments
api = HfApi()
These two classes can help us frame a solution to the problem above. The ModelSearchArguments class is a namespace-like one that contains every single valid parameter we can search for!

Let's take a peek:
>>> model_args = ModelSearchArguments()
>>> model_args
Available Attributes or Keys:
* author
* dataset
* language
* library
* license
* model_name
* pipeline_tag
There is a variety of attributes available to us (more on how this magic is done later). If we were to categorize what we wanted, we could likely separate it out as:

- pipeline_tag (or task): Text Classification
- dataset: GLUE
- library: PyTorch
Given this separation, it makes sense that we would find them within our declared model_args:
>>> model_args.pipeline_tag.TextClassification
'text-classification'
>>> model_args.dataset.glue
'dataset:glue'
>>> model_args.library.PyTorch
'pytorch'
But here we begin to notice some of the convenient wrapping being performed. ModelSearchArguments (and the complementary DatasetSearchArguments) have a human-readable interface with formatted outputs the API wants, such as how the GLUE dataset should be searched for with dataset:glue.

This is imperative, because without this "cheat sheet" of knowing how certain parameters should be written, you can very easily get frustrated when trying to search the API for models!
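To picture the kind of mapping these helper classes provide, here is a minimal, hypothetical sketch. The names `ATTR_TO_FILTER` and `to_filter` are invented for illustration and are not part of huggingface_hub; the real ModelSearchArguments builds its values from live Hub tags.

```python
# Hypothetical sketch: a plain dictionary mapping attribute-friendly
# names to the raw filter strings the API expects. Entries hard-coded
# purely for illustration.
ATTR_TO_FILTER = {
    "TextClassification": "text-classification",  # pipeline tag
    "glue": "dataset:glue",                       # dataset filter
    "PyTorch": "pytorch",                         # library tag
}

def to_filter(attribute_name: str) -> str:
    """Translate a human-friendly attribute name into a raw filter string."""
    return ATTR_TO_FILTER[attribute_name]

print(to_filter("glue"))  # dataset:glue
```

The value of this indirection is that you tab-complete readable names while the API still receives the exact strings it requires.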
Now that we know the right parameters, we can search the API with ease:
>>> models = api.list_models(filter = (
...     model_args.pipeline_tag.TextClassification,
...     model_args.dataset.glue,
...     model_args.library.PyTorch)
... )
>>> print(len(models))
140
We found 140 matching models that fit our criteria (at the time this article was written)! And if we take a closer look at one, we can see that it does indeed look right:
>>> models[0]
ModelInfo: {
modelId: Jiva/xlm-roberta-large-it-mnli
sha: c6e64469ec4aa17fedbd1b2522256f90a90b5b86
lastModified: 2021-12-10T14:56:38.000Z
tags: ['pytorch', 'xlm-roberta', 'text-classification', 'it', 'dataset:multi_nli', 'dataset:glue', 'arxiv:1911.02116', 'transformers', 'tensorflow', 'license:mit', 'zero-shot-classification']
pipeline_tag: zero-shot-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='sentencepiece.bpe.model'), ModelFile(rfilename='special_tokens_map.json'), ModelFile(rfilename='tokenizer.json'), ModelFile(rfilename='tokenizer_config.json')]
config: None
private: False
downloads: 680
library_name: transformers
likes: 1
}
It's a bit more readable, and there's no guessing involved with "Did I get this parameter right?"

Did you know you can also get the information of this model programmatically with its model ID? Here's how you would do it:
api.model_info('Jiva/xlm-roberta-large-it-mnli')
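Conceptually, the filtering the backend performs boils down to tag matching: a model matches when its tags contain every requested filter string. Here is a self-contained sketch of that idea, using made-up model entries rather than live API results:

```python
# Local sketch of tag-based filtering: a model matches when its tag set
# contains every requested filter. The model entries below are invented.
models = [
    {"modelId": "a/model-1", "tags": ["pytorch", "text-classification", "dataset:glue"]},
    {"modelId": "b/model-2", "tags": ["tensorflow", "text-classification"]},
    {"modelId": "c/model-3", "tags": ["pytorch", "translation", "dataset:wmt16"]},
]

def matches(model, filters):
    """True when every requested filter string appears in the model's tags."""
    return all(f in model["tags"] for f in filters)

wanted = ("text-classification", "dataset:glue", "pytorch")
hits = [m["modelId"] for m in models if matches(m, wanted)]
print(hits)  # ['a/model-1']
```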
Taking it up a Notch

We saw how we could use ModelSearchArguments and DatasetSearchArguments to remove the guesswork from searching the Hub, but what about if we have a very complex, messy query?
Such as: "I want to search for all models trained for both text-classification and zero-shot classification, trained on the Multi NLI and GLUE datasets, and compatible with both PyTorch and TensorFlow."
To set up this query, we will use the ModelFilter class. It is designed to handle these kinds of situations, so we don't need to scratch our heads about it:
>>> filt = ModelFilter(
...     task = ["text-classification", "zero-shot-classification"],
...     trained_dataset = [model_args.dataset.multi_nli, model_args.dataset.glue],
...     library = ['pytorch', 'tensorflow']
... )
>>> api.list_models(filt)
[ModelInfo: {
modelId: Jiva/xlm-roberta-large-it-mnli
sha: c6e64469ec4aa17fedbd1b2522256f90a90b5b86
lastModified: 2021-12-10T14:56:38.000Z
tags: ['pytorch', 'xlm-roberta', 'text-classification', 'it', 'dataset:multi_nli', 'dataset:glue', 'arxiv:1911.02116', 'transformers', 'tensorflow', 'license:mit', 'zero-shot-classification']
pipeline_tag: zero-shot-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='sentencepiece.bpe.model'), ModelFile(rfilename='special_tokens_map.json'), ModelFile(rfilename='tokenizer.json'), ModelFile(rfilename='tokenizer_config.json')]
config: None
private: False
downloads: 680
library_name: transformers
likes: 1
}]
Very quickly we see that it's a much more coordinated approach to searching through the API, with no added headache for you!
What is the magic?

Very briefly, we will talk about the underlying magic at play that gives us this enum-dictionary-like datatype, the AttributeDictionary.
Heavily inspired by the AttrDict class from the fastcore library, the general idea is that we take a normal dictionary and supercharge it for exploratory programming by providing tab-completion for every key in the dictionary.
As we saw earlier, this gets even stronger when we have nested dictionaries we can explore through, such as model_args.dataset.glue!
For those familiar with JavaScript, we mimic how the object class works.

This simple utility class can provide a much more user-centered experience when exploring nested datatypes and trying to understand what is there, such as the return value of an API request!
As mentioned before, we expand on AttrDict in a few key ways:

- You can delete keys with del model_args[key] or del model_args.key
- The clean __repr__ we saw earlier

One very important concept to note, though, is that if a key contains a number or special character, it must be indexed as a dictionary and not as an object.
>>> from huggingface_hub.utils.endpoint_helpers import AttributeDictionary
A very brief example of this is if we have an AttributeDictionary with a key of 3_c:
>>> d = {"a":2, "b":3, "3_c":4}
>>> ad = AttributeDictionary(d)
>>> # As an attribute
>>> ad.3_c
File "<ipython-input-6-c0fe109cf75d>", line 2
ad.3_c
^
SyntaxError: invalid token
>>> # As a dictionary key
>>> ad["3_c"]
4
Concluding thoughts

Hopefully by now you have a brief understanding of how this new searching API can directly impact your workflow and exploration of the Hub! Along with this, perhaps you know of a place in your code where the AttributeDictionary might be useful to you.

From here, make sure to check out the official documentation on searching the Hub efficiently, and don't forget to give us a star!