Introducing new audio and vision documentation in 🤗 Datasets
Open and reproducible datasets are essential for advancing good machine learning. At the same time, datasets have grown tremendously in size as fuel for large language models. In 2020, Hugging Face launched 🤗 Datasets, a library dedicated to:
- Providing access to standardized datasets with a single line of code.
- Providing tools for processing large-scale datasets rapidly and efficiently.
Thanks to the community, we have added hundreds of NLP datasets in many languages and dialects! 🤗 ❤️
But text datasets are just the beginning. Data is represented in richer formats such as 🎵 audio, 📸 images, and combinations of audio and text or image and text. Models trained on these datasets enable awesome applications, like describing what is in an image or answering questions about an image.
The 🤗 Datasets team has been building tools and features to make working with these dataset types as simple as possible. We have also added new documentation to help you learn more about loading and processing audio and image datasets.
Quickstart
The Quickstart is one of the first places new users visit to get the gist of a library's features. That's why we updated the Quickstart to show how you can use 🤗 Datasets to work with audio and image datasets. Choose the dataset modality you want to work with and follow an end-to-end example that loads and processes the dataset until it is ready for training with either PyTorch or TensorFlow.
Also new in the Quickstart is the to_tf_dataset function, which takes care of the code needed to convert a dataset into a tf.data.Dataset. This means you don't have to write any code to shuffle your dataset and load batches from it. Once you've converted your dataset into a tf.data.Dataset, you can train your model with the usual TensorFlow or Keras methods.
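To get a feel for the kind of boilerplate this removes, here is a minimal plain-Python sketch of shuffling a dataset and yielding batches by hand. It is only an illustration and not how to_tf_dataset is implemented; the helper name shuffled_batches and its parameters are hypothetical.

```python
import random
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def shuffled_batches(examples: Sequence[T], batch_size: int, seed: int = 0) -> Iterator[List[T]]:
    """Yield batches of examples in a shuffled order (hypothetical helper)."""
    indices = list(range(len(examples)))
    # A seeded generator keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        # The last batch may be smaller than batch_size.
        yield [examples[i] for i in indices[start:start + batch_size]]


toy_dataset = list(range(10))
batches = list(shuffled_batches(toy_dataset, batch_size=4))
for batch in batches:
    print(batch)
```

Every example appears exactly once per pass, and only the final batch can be short. A helper like this is the sort of code you would otherwise maintain yourself.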
Check out the Quickstart today to learn how to work with different dataset modalities and try out the new to_tf_dataset function!
Dedicated guides
Each dataset modality has its own nuances when it comes to loading and processing. For example, when you load an audio dataset, the audio signal is automatically decoded and resampled by the Audio feature. This is quite different from loading a text dataset!
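As a rough picture of what "resampled automatically" can mean, the sketch below resamples a clip only at the moment it is accessed. It is a conceptual illustration under simplifying assumptions (linear interpolation, no decoding step, no anti-alias filtering) and not the Audio feature's implementation; resample_linear and LazyAudioColumn are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple, Union


def resample_linear(samples: Sequence[float], orig_sr: int, target_sr: int) -> List[float]:
    """Resample with linear interpolation (illustration only, no filtering)."""
    if orig_sr == target_sr or len(samples) < 2:
        return list(samples)
    n_out = int(round(len(samples) * target_sr / orig_sr))
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr  # fractional position in the original signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


class LazyAudioColumn:
    """Stores raw clips and resamples a clip only when it is accessed."""

    def __init__(self, clips: List[Tuple[List[float], int]], target_sr: int) -> None:
        self._clips = clips
        self._target_sr = target_sr

    def __getitem__(self, index: int) -> Dict[str, Union[List[float], int]]:
        samples, sr = self._clips[index]
        return {
            "array": resample_linear(samples, sr, self._target_sr),
            "sampling_rate": self._target_sr,
        }


# One toy clip recorded at 8 kHz, requested at 16 kHz.
column = LazyAudioColumn([([0.0, 1.0, 0.0, -1.0], 8_000)], target_sr=16_000)
example = column[0]
print(len(example["array"]), example["sampling_rate"])
```

Nothing is resampled until an example is indexed, which is the point of doing the work lazily instead of converting the whole dataset up front.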
To make the modality-specific documentation more discoverable, there are new dedicated sections with guides that show you how to load and process each modality. If you're looking for specific information about working with a dataset modality, take a look at these dedicated sections first. Meanwhile, functions that are not modality-specific and can be used broadly are documented in the General Usage section. Reorganizing the documentation this way will allow us to scale better to the other dataset types we plan to support in the future.
Check out the dedicated guides to learn more about loading and processing datasets for different modalities.
ImageFolder
Typically, 🤗 Datasets users write a dataset loading script to download and generate a dataset with the appropriate train and test splits. With the ImageFolder dataset builder, you don't need to write any code to download and generate an image dataset. Loading an image dataset for image classification is as simple as making sure your dataset is organized in folders like this:
folder/train/dog/golden_retriever.png
folder/train/dog/german_shepherd.png
folder/train/dog/chihuahua.png
folder/train/cat/maine_coon.png
folder/train/cat/bengal.png
folder/train/cat/birman.png
Image labels are generated in a label column based on the directory name. ImageFolder lets you get started with an image dataset right away, saving the time and effort needed to write a dataset loading script.
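The idea of deriving the split and label from the directory layout can be sketched in plain Python. This is an illustration of the convention only, not ImageFolder's own code; the helper infer_examples is hypothetical, and assigning integer ids in alphabetical order is an assumption made here for the example.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Union

paths = [
    "folder/train/dog/golden_retriever.png",
    "folder/train/dog/german_shepherd.png",
    "folder/train/dog/chihuahua.png",
    "folder/train/cat/maine_coon.png",
    "folder/train/cat/bengal.png",
    "folder/train/cat/birman.png",
]


def infer_examples(image_paths: List[str]) -> List[Dict[str, Union[str, int]]]:
    """Read the split and label from a <root>/<split>/<label>/<file> layout."""
    # Assumption for this sketch: label ids follow alphabetical order.
    label_names = sorted({PurePosixPath(p).parts[-2] for p in image_paths})
    examples = []
    for p in image_paths:
        parts = PurePosixPath(p).parts
        split, label = parts[-3], parts[-2]
        examples.append({
            "image": p,
            "split": split,
            "label": label,
            "label_id": label_names.index(label),
        })
    return examples


examples = infer_examples(paths)
print(examples[0])
```

Because the label comes straight from the parent directory, renaming or moving a folder is all it takes to relabel the images inside it.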
But wait, it gets even better! If you have a file containing metadata about your image dataset, ImageFolder can also be used for other image tasks such as image captioning and object detection. For example, object detection datasets commonly include bounding boxes, which are coordinates within an image that identify where an object is. ImageFolder can use this file to link the bounding box and category metadata for each image to the corresponding image in the folder:
{"file_name": "0001.png", "objects": {"bbox": [[302.0, 109.0, 73.0, 52.0]], "categories": [0]}}
{"file_name": "0002.png", "objects": {"bbox": [[810.0, 100.0, 57.0, 28.0]], "categories": [1]}}
{"file_name": "0003.png", "objects": {"bbox": [[160.0, 31.0, 248.0, 616.0], [741.0, 68.0, 202.0, 401.0]], "categories": [2, 2]}}
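from datasets import load_dataset

# Load the train split; each metadata record is linked to its image through file_name
dataset = load_dataset("imagefolder", data_dir="/path/to/folder", split="train")
dataset[0]["objects"]
{"bbox": [[302.0, 109.0, 73.0, 52.0]], "categories": [0]}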
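The linking step can be pictured as a lookup keyed on file_name. The sketch below assumes the metadata is stored as JSON Lines text like the records shown earlier; it is not ImageFolder's actual implementation, and the helper index_metadata is hypothetical.

```python
import json
from typing import Any, Dict

METADATA_JSONL = """\
{"file_name": "0001.png", "objects": {"bbox": [[302.0, 109.0, 73.0, 52.0]], "categories": [0]}}
{"file_name": "0002.png", "objects": {"bbox": [[810.0, 100.0, 57.0, 28.0]], "categories": [1]}}
{"file_name": "0003.png", "objects": {"bbox": [[160.0, 31.0, 248.0, 616.0], [741.0, 68.0, 202.0, 401.0]], "categories": [2, 2]}}
"""


def index_metadata(jsonl_text: str) -> Dict[str, Dict[str, Any]]:
    """Map each file_name to the remaining fields of its metadata record."""
    index: Dict[str, Dict[str, Any]] = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        file_name = record.pop("file_name")
        index[file_name] = record
    return index


metadata = index_metadata(METADATA_JSONL)
print(metadata["0001.png"]["objects"])
```

Each image file then picks up its extra columns by looking up its own name, which is why file_name has to match the file in the folder.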
If you have a metadata file with the required information, you can use ImageFolder to load an image dataset for nearly any type of image task. Check out the ImageFolder guide to learn more.
What's next?
Just as the first iteration of the 🤗 Datasets library standardized text datasets and made them very easy to download and process, we are excited to bring the same level of user-friendliness to audio and image datasets. In doing so, we hope it will be easier for users to train, build, and evaluate models and applications across different modalities.
In the coming months, we'll keep adding new features and tools for working with audio and image datasets. Word on the 🤗 Hugging Face street is that something called AudioFolder is coming soon! 🤫 While you wait, take a look at the audio processing guide and then get hands-on with an audio dataset like GigaSpeech.
Join the forum for any questions and feedback about working with audio and image datasets. If you discover a bug, please open a GitHub Issue so we can take care of it.
Feeling a little more adventurous? Contribute to the growing community-driven collection of audio and image datasets on the Hub! Create a dataset repository on the Hub and upload your dataset. If you need a hand, open a discussion on your repository's Community tab and ping a member of the 🤗 Datasets team, and we'll help you cross the finish line!