Copy-ready prompt
A charming horizontal educational infographic, presented in a hand-drawn scrapbook style with a soft background of desktop and paper collages, aims to explain the principles of Chinese word segmentation based on byte-level BPE. The image is divided into three clearly defined teaching areas, spanning the entire wide banner from left to right. A cute Shiba Inu mascot stands on the far left. {argument name="character name" default="Chai Xiaoqi"} With warm brown and cream fur, a round face, small triangular ears, rosy cheeks, and a curious expression, it stands beside a desk with drawers, a pencil, and a chair, holding a cup in its paw. Above the Shiba Inu is a bold, rounded white title box containing black Chinese characters: {argument name="headline text" default="Chinese word segmentation: Byte-level BPE (BBPE) process popular science"} In the first teaching area near the top, four translucent blue token-shaped cubes are displayed on a wooden shelf. Each cube is labeled "Token" and accompanied by a curved arrow and a handwritten Chinese annotation, "Corpus Word Frequency Statistics," pointing to the next step. In the second area, a large magnifying glass highlights three byte cubes labeled "E7," "94," and "B5." The area label is written on a yellow sticky note, stating "2. Frequency Statistics and Merging." Below and inside the magnified area is a large black Chinese character "电" (electricity), next to which is a handwritten annotation, "Frequent Byte Pairs." In the lower center of the third area sit a wooden sign and a larger merged translucent blue token cube labeled "Token," accompanied by a yellow sticky note stating "3. Cross-Word Merging," with the large black Chinese characters "我们→" (we →), and below it an explanatory note, "High-frequency words are combined and merged into tokens."
On the far right, the final explanation is displayed: three small byte squares labeled "E7," "94," and "B5" sit above a large black Chinese character "电" (electricity), next to a note that reads "1. Byte-level encoding (UTF-8)," and below is a large black Chinese word "我们" (we). Pink, blue, and green curved arrows connect the various areas to illustrate the flow. Near the bottom center is a blue token mascot with tiny limbs, a smiling face, and a waving hand. The design uses soft cream, pink, beige, and light blue, with thick, clear lines, sticker-like cutouts, washi tape corners, notebook textures, scattered pencil edges, and a friendly, journal-style look suitable for educational charts.
Prompt breakdown
A charming horizontal educational infographic, presented in a hand-drawn scrapbook style with a soft background of desktop and paper collages, aims to explain the principles of Chinese word segmentation based on byte-level BPE.
The image is divided into three clearly defined teaching areas, spanning the entire wide banner from left to right.
A cute Shiba Inu mascot stands on the far left. {argument name="character name" default="Chai Xiaoqi"} With warm brown and cream fur, a round face, small triangular ears, rosy cheeks, and a curious expression, it stands beside a desk with drawers, a pencil, and a chair, holding a cup in its paw.
Above the Shiba Inu is a bold, rounded white title box containing black Chinese characters: {argument name="headline text" default="Chinese word segmentation: Byte-level BPE (BBPE) process popular science"}
In the first teaching area near the top, four translucent blue token-shaped cubes are displayed on a wooden shelf.
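Before rendering, it can help to check the byte labels the prompt asks for. The Python sketch below confirms that the UTF-8 encoding of "电" is E7 94 B5, then runs a toy version of the "frequency statistics and merging" step the infographic depicts. The function names (`utf8_hex`, `most_frequent_pair`, `merge_pair`, `train`) and the tiny corpus are illustrative assumptions, not part of the prompt or of any particular tokenizer library.

```python
from collections import Counter


def utf8_hex(text):
    """Return the UTF-8 bytes of text as two-digit uppercase hex strings."""
    return [f"{b:02X}" for b in text.encode("utf-8")]


def most_frequent_pair(sequences):
    """Count adjacent symbol pairs over all sequences; return the most common, or None."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None


def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of pair in seq with one merged symbol."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def train(corpus, num_merges):
    """Toy byte-level BPE: start from single bytes and apply up to num_merges merges."""
    sequences = [[bytes([b]) for b in word.encode("utf-8")] for word in corpus]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:  # every sequence is already a single symbol
            break
        merges.append(pair)
        sequences = [merge_pair(seq, pair) for seq in sequences]
    return merges, sequences


if __name__ == "__main__":
    print(utf8_hex("电"))    # ['E7', '94', 'B5']
    print(utf8_hex("我们"))  # ['E6', '88', '91', 'E4', 'BB', 'AC']
    # Hypothetical mini-corpus, chosen only so that the bytes of "电" recur.
    merges, sequences = train(["电", "电脑", "电话", "我们"], num_merges=2)
    print(merges)
    print(sequences)
```

Real BBPE tokenizers learn merges from a large corpus and usually apply a pre-tokenization step first, so this sketch only mirrors the simplified flow shown in the image.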









