🧨 Stable Diffusion with JAX / Flax!
🤗 Hugging Face Diffusers supports Flax since version 0.5.1! This allows for super fast inference on Google TPUs, such as those available in Colab, Kaggle, or Google Cloud Platform.
This post shows how to run inference using JAX / Flax. If you want more details about how Stable Diffusion works, or want to run it on a GPU, please refer to this Colab notebook.
If you want to follow along, click the button above to open this post as a Colab notebook.
First, make sure you are using a TPU backend. If you are running this notebook in Colab, select Runtime in the menu above, then select the option "Change runtime type", and then select TPU under the Hardware accelerator setting.
Note that JAX is not exclusive to TPUs, but it shines on that hardware because each TPU server has 8 TPU accelerators working in parallel.
Setup
import jax
num_devices = jax.device_count()
device_type = jax.devices()[0].device_kind
print(f"Found {num_devices} JAX devices of type {device_type}.")
assert "TPU" in device_type, "Available device is not a TPU, please select TPU from Edit > Notebook settings > Hardware accelerator"
Output:
Found 8 JAX devices of type TPU v2.
Make sure diffusers is installed.
!pip install diffusers==0.5.1
Then we import all the dependencies.
import numpy as np
import jax
import jax.numpy as jnp
from pathlib import Path
from jax import pmap
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from PIL import Image
from huggingface_hub import notebook_login
from diffusers import FlaxStableDiffusionPipeline
Model Loading
Before using the model, you need to accept the model license in order to download and use the weights.
The license is designed to mitigate the potential harmful effects of such a powerful machine learning system. We ask users to read the license entirely and carefully. Here is a summary:
- You can't use the model to deliberately produce or share illegal or harmful outputs or content.
- We claim no rights on the outputs you generate. You are free to use them, and you are accountable for their use, which must not go against the provisions set in the license.
- You may redistribute the weights and use the model commercially and/or as a service. If you do, you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users.
Flax weights are available on the Hugging Face Hub as part of the Stable Diffusion repo. The Stable Diffusion model is distributed under the CreativeML OpenRAIL-M license. It is an open license that claims no rights on the outputs you generate and prohibits you from deliberately producing illegal or harmful content. The model card provides more details, so take a moment to read them and consider carefully whether you accept the license. If you do, you need to be a registered user on the Hub and use an access token for the code to work. You have two options to provide your access token:
- Use the huggingface-cli login command-line tool in your terminal and paste your token when prompted. It will be saved in a file on your computer.
- Or use notebook_login() in a notebook, which does the same thing.
The following cell will present a login interface unless you have already authenticated on this computer. You will need to paste your access token.
if not (Path.home()/'.huggingface'/'token').exists(): notebook_login()
TPU devices support bfloat16, an efficient half-float type. We'll use it for our tests, but you can also use float32 to run in full precision instead.
dtype = jnp.bfloat16
Flax is a functional framework, so models are stateless and parameters are stored outside them. Loading the pre-trained Flax pipeline returns both the pipeline itself and the model weights (or parameters). We are using a bf16 version of the weights, which leads to type warnings that you can safely ignore.
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
"CompVis/stable-diffusion-v1-4",
revision="bf16",
dtype=dtype,
)
Inference
Since TPUs usually have 8 devices working in parallel, we'll replicate our prompt as many times as we have devices. Then we'll perform inference on the 8 devices at once, each responsible for generating one image. Thus, we'll get 8 images in the same amount of time it takes for one chip to generate a single one.
After replicating the prompt, we obtain the tokenized text ids by invoking the prepare_inputs function of the pipeline. The length of the tokenized text is set to 77 tokens, as required by the configuration of the underlying CLIP text model.
prompt = "A cinematic film still of Morgan Freeman starring as Jimi Hendrix, portrait, 40mm lens, shallow depth of field, close up, split lighting, cinematic"
prompt = [prompt] * jax.device_count()
prompt_ids = pipeline.prepare_inputs(prompt)
prompt_ids.shape
Output:
(8, 77)
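Replication and parallelization
Model parameters have to be replicated, and inputs distributed, across the 8 parallel devices we have. The parameters dictionary is replicated using flax.jax_utils.replicate, which traverses the dictionary and changes the shape of the weights so they are repeated 8 times. Input arrays are split across the devices using shard.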
p_params = replicate(params)
prompt_ids = shard(prompt_ids)
prompt_ids.shape
Output:
(8, 1, 77)
That shape means that each of the 8 devices will receive as input a jnp array with shape (1, 77). 1 is therefore the batch size per device. On TPUs with sufficient memory, it could be larger than 1 if we wanted to generate multiple images (per chip) at once.
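The shape change described above can be sketched in plain Python. This is only an illustration of the reshape arithmetic, assuming the leading axis is divisible by the device count; `sharded_shape` is a hypothetical helper written for this example and is not part of Flax.

```python
def sharded_shape(shape, num_devices):
    """Shape after splitting the leading axis across devices.

    Hypothetical helper: (N, ...) -> (num_devices, N // num_devices, ...).
    """
    batch = shape[0]
    if batch % num_devices != 0:
        raise ValueError("leading axis must be divisible by the device count")
    return (num_devices, batch // num_devices) + tuple(shape[1:])


# One prompt per device, as in this post.
one_per_device = sharded_shape((8, 77), num_devices=8)
# Two prompts per device, which would need more memory per chip.
two_per_device = sharded_shape((16, 77), num_devices=8)
print(one_per_device)
print(two_per_device)
```

With 16 prompts on 8 devices, the second axis (the per-device batch size) becomes 2 instead of 1.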
We are almost ready to generate images! We just need to create a random number generator to pass to the generation function. This is the standard procedure in Flax: all functions that deal with random numbers are expected to receive a generator. This ensures reproducibility, even when we are training across multiple distributed devices.
The helper function below uses a seed to initialize a random number generator. As long as we use the same seed, we'll get the exact same results. Feel free to use different seeds when exploring results later in the notebook.
def create_key(seed=0):
    return jax.random.PRNGKey(seed)
We obtain a generator and then split it 8 times so each device receives a different one. Therefore, each device will create a different image, and the full process is reproducible.
rng = create_key(0)
rng = jax.random.split(rng, jax.device_count())
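To see why splitting keeps the process reproducible, the sketch below splits the same seed twice and compares the results. It should run on any JAX backend, including CPU. The device count of 8 is hard-coded as an assumption so that the example does not depend on TPU hardware, and `make_device_keys` is a hypothetical helper name.

```python
import jax
import numpy as np

NUM_DEVICES = 8  # assumption: mirrors the 8 TPU devices used in this post


def make_device_keys(seed, num_devices=NUM_DEVICES):
    """Create one PRNG key from `seed` and split it into one key per device."""
    key = jax.random.PRNGKey(seed)
    return jax.random.split(key, num_devices)


keys_a = np.asarray(make_device_keys(0))
keys_b = np.asarray(make_device_keys(0))
keys_c = np.asarray(make_device_keys(1))

# Same seed: identical keys, so every device repeats its result exactly.
same_seed_matches = bool((keys_a == keys_b).all())
# Different seed: the keys change, so the generated images change.
different_seed_differs = bool((keys_a != keys_c).any())
# Within one split, every device gets its own key.
distinct_keys = len({tuple(k.tolist()) for k in keys_a})
print(keys_a.shape[0], same_seed_matches, different_seed_differs, distinct_keys)
```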
JAX code can be compiled to an efficient representation that runs very fast. However, we need to ensure that all inputs have the same shape in subsequent calls; otherwise, JAX will have to recompile the code, and we wouldn't be able to take advantage of the optimized speed.
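The effect of input shapes on compilation can be observed with a small, self-contained sketch. It is not the pipeline code; `scale` is a toy function, and the counter relies on the fact that Python side effects inside a jitted function only run while JAX traces it.

```python
import jax
import jax.numpy as jnp

trace_count = 0  # incremented only while JAX traces (and compiles) the function


@jax.jit
def scale(x):
    global trace_count
    trace_count += 1  # Python side effect: runs at trace time, not on cached calls
    return x * 2.0


scale(jnp.ones((1, 77)))   # first call: traced and compiled
scale(jnp.zeros((1, 77)))  # same shape and dtype: the compiled version is reused
traces_same_shape = trace_count

scale(jnp.ones((2, 77)))   # new shape: JAX has to trace and compile again
traces_new_shape = trace_count
print(traces_same_shape, traces_new_shape)
```

Different values with the same shape and dtype reuse the compiled code, which is why later pipeline calls with new prompts stay fast as long as the input shapes do not change.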
The Flax pipeline can compile the code for us if we pass jit = True as an argument. It will also ensure that the model runs in parallel on the 8 available devices.
The first time we run the following cell it will take a long time to compile, but subsequent calls (even with different inputs) will be much faster. For example, it took more than a minute to compile on a TPU v2-8 when I tested, but then it takes about 7s for future inference runs.
images = pipeline(prompt_ids, p_params, rng, jit=True)[0]
Output:
CPU times: user 464 ms, sys: 105 ms, total: 569 ms
Wall time: 7.07 s
The returned array has shape (8, 1, 512, 512, 3). We reshape it to get rid of the second dimension and obtain 8 images of 512 × 512 × 3, and then convert them to PIL.
images = images.reshape((images.shape[0],) + images.shape[-3:])
images = pipeline.numpy_to_pil(images)
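The reshape above keeps the device axis as the image axis, which works because the per-device batch size is 1. As a hedged sketch of the more general case, the plain-Python helper below (a hypothetical name, not part of diffusers) computes the flattened shape when each device returns more than one image.

```python
def flattened_image_shape(shape):
    """Merge the device axis and the per-device batch axis into one image axis.

    `shape` is assumed to be (devices, batch_per_device, height, width, channels).
    """
    devices, batch_per_device = shape[0], shape[1]
    return (devices * batch_per_device,) + tuple(shape[-3:])


single = flattened_image_shape((8, 1, 512, 512, 3))
double = flattened_image_shape((8, 2, 512, 512, 3))
print(single)
print(double)
```

With a per-device batch of 2, the target shape needs 16 as its leading dimension rather than images.shape[0]; passing -1 for that dimension to reshape would achieve the same result.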
Visualization
Let's create a helper function to display images in a grid.
def image_grid(imgs, rows, cols):
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid
image_grid(images, 2, 4)
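The paste positions used by image_grid follow from simple integer arithmetic: the column is the index modulo the number of columns, and the row is the index divided by it. The sketch below reproduces only that arithmetic without PIL; `grid_boxes` is a hypothetical helper for illustration.

```python
def grid_boxes(num_images, cols, w, h):
    """Top-left (x, y) paste position per image, filling rows left to right."""
    return [(i % cols * w, i // cols * h) for i in range(num_images)]


# 8 images of 512 x 512 pixels arranged in 2 rows of 4 columns.
boxes = grid_boxes(num_images=8, cols=4, w=512, h=512)
for index, box in enumerate(boxes):
    print(index, box)
```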
Using different prompts
We don't have to replicate the same prompt on all the devices. We can do whatever we want: generate 2 prompts 4 times each, or even generate 8 different prompts at once. Let's do that!
First, we prepare the inputs again, this time from a list of 8 different prompts:
prompts = [
"Labrador in the style of Hokusai",
"Painting of a squirrel skating in New York",
"HAL-9000 in the style of Van Gogh",
"Times Square under water, with fish and a dolphin swimming around",
"Ancient Roman fresco showing a man working on his laptop",
"Close-up photograph of young black woman against urban background, high quality, bokeh",
"Armchair in the shape of an avocado",
"Clown astronaut in space, with Earth in the background",
]
prompt_ids = pipeline.prepare_inputs(prompts)
prompt_ids = shard(prompt_ids)
images = pipeline(prompt_ids, p_params, rng, jit=True).images
images = images.reshape((images.shape[0], ) + images.shape[-3:])
images = pipeline.numpy_to_pil(images)
image_grid(images, 2, 4)
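The other option mentioned earlier, 2 prompts generated 4 times each, only requires building a list whose length matches the device count. A minimal sketch, assuming 8 devices and using a hypothetical helper name:

```python
def repeat_prompts(base_prompts, num_devices):
    """Repeat each prompt so the final list has exactly one entry per device."""
    if num_devices % len(base_prompts) != 0:
        raise ValueError("device count must be a multiple of the number of prompts")
    repeats = num_devices // len(base_prompts)
    return [prompt for prompt in base_prompts for _ in range(repeats)]


two_prompts = [
    "Labrador in the style of Hokusai",
    "Armchair in the shape of an avocado",
]
repeated = repeat_prompts(two_prompts, num_devices=8)  # assumption: 8 devices
print(len(repeated))
```

The resulting list can be passed to pipeline.prepare_inputs and shard in the same way as the 8-prompt list above. Because each device receives a different random generator, the 4 copies of a prompt should still produce 4 different images.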
How does parallelization work?
We said before that the diffusers Flax pipeline automatically compiles the model and runs it in parallel on all available devices. We'll now briefly look inside that process to show how it works.
JAX parallelization can be done in multiple ways. The easiest one revolves around using the jax.pmap function to achieve single-program, multiple-data (SPMD) parallelization. It means we'll run several copies of the same code, each on different data inputs. More sophisticated approaches are possible; if you are interested, we invite you to go over the JAX documentation and the pjit pages to explore this topic.
jax.pmap does two things for us:
- Compiles (or jits) the code. This does not happen when we call pmap, but the first time the pmapped function is invoked.
- Ensures the compiled code runs in parallel on all the available devices.
To show how it works, we pmap the _generate method of the pipeline, which is the private method that generates images. Please note that this method may be renamed or removed in future releases of diffusers.
p_generate = pmap(pipeline._generate)
After we use pmap, the prepared function p_generate will conceptually do the following:
- Invoke a copy of the underlying function pipeline._generate on each device.
- Send each device a different portion of the input arguments. That's what sharding is used for. In our case, prompt_ids has shape (8, 1, 77). This array will be split in 8, and each copy of _generate will receive an input with shape (1, 77).
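The two points above can be observed with a toy function instead of pipeline._generate. This sketch adapts to however many local devices JAX finds (8 on a TPU v2-8, typically 1 on CPU); `double` and `per_device_shapes` are names made up for the example.

```python
import jax
import jax.numpy as jnp

num_devices = jax.local_device_count()
per_device_shapes = []  # filled at trace time with the shape each copy sees


def double(x):
    per_device_shapes.append(x.shape)  # runs once, while pmap traces the function
    return x * 2


p_double = jax.pmap(double)  # nothing is compiled yet

# The leading axis is the device axis, like the (8, 1, 77) prompt_ids in this post.
inputs = jnp.arange(num_devices * 77).reshape((num_devices, 1, 77))
outputs = p_double(inputs)  # first call: compile, then run one copy per device
print(inputs.shape, per_device_shapes[0], outputs.shape)
```

The function body never sees the device axis: it is traced with shape (1, 77), and pmap reassembles the per-device results along a new leading axis.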
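We can code _generate completely ignoring the fact that it will be invoked in parallel. We just care about our batch size (1 in this example) and the dimensions that make sense for our code, and don't have to change anything to make it work in parallel.
In the same way as when we used the pipeline call, the first time we run the following cell it will take a while, but then it will be much faster.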
images = p_generate(prompt_ids, p_params, rng)
images = images.block_until_ready()
images.shape
Output:
CPU times: user 118 ms, sys: 83.9 ms, total: 202 ms
Wall time: 6.82 s
(8, 1, 512, 512, 3)
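We use block_until_ready() to correctly measure inference time, because JAX uses asynchronous dispatch and returns control to the Python loop as soon as it can. You don't need to use that in your code; blocking will occur automatically when you want to use the result of a computation that has not yet been materialized.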
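A small timing sketch shows where block_until_ready() fits when measuring. It uses a toy matrix product rather than the diffusion pipeline, and the array size is an arbitrary choice for illustration.

```python
import time

import jax
import jax.numpy as jnp


@jax.jit
def heavy(x):
    return (x @ x).sum()


x = jnp.ones((512, 512))
heavy(x).block_until_ready()  # warm-up call so compilation is not timed

start = time.perf_counter()
result = heavy(x).block_until_ready()  # wait until the device has finished
elapsed = time.perf_counter() - start
print(f"{elapsed:.4f} s", float(result))
```

Without the block_until_ready() call, the timer could stop as soon as the work has been dispatched, which would under-report the actual compute time.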