Resources
Academic
ZhoBLiMP: a systematic assessment of language models with linguistic minimal pairs in Chinese. ZhoBLiMP is a benchmark for probing the Chinese linguistic knowledge of language models, especially syntax. To validate ZhoBLiMP, we trained 5 Zh-Pythia models from scratch on Chinese data.
- Dataset: ZhoBLiMP covers 118 paradigms across 15 high-level linguistic phenomena. It contains 35k minimal pairs, each differing in a single way that isolates one syntactic or semantic contrast.
- Models: The Zh-Pythia models match the parameter counts of their English Pythia counterparts (14M, 70M, 160M, 410M, 1.4B).
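The standard way minimal-pair benchmarks of this kind are scored can be sketched as follows. This is an illustrative sketch, not the ZhoBLiMP evaluation code: `score_fn` is a hypothetical interface standing in for any function that returns a sentence-level log-probability under a language model.

```python
# Hedged sketch of minimal-pair scoring: a pair counts as correct when
# the model assigns higher log-probability to the grammatical sentence
# than to its minimally different ungrammatical twin.

def minimal_pair_accuracy(pairs, score_fn):
    """pairs: list of (grammatical, ungrammatical) sentence pairs.
    score_fn: any callable returning a sentence log-probability."""
    if not pairs:
        return 0.0
    correct = sum(1 for good, bad in pairs if score_fn(good) > score_fn(bad))
    return correct / len(pairs)

# Toy stand-in scorer (NOT a real LM): longer strings score lower.
toy_score = lambda sentence: -len(sentence)

pairs = [
    ("he runs", "he runs quickly fast"),
    ("she sleeps", "she sleeps sleeps"),
]
print(minimal_pair_accuracy(pairs, toy_score))
```

In practice `score_fn` would sum per-token log-probabilities from one of the Zh-Pythia checkpoints; the toy scorer above only exists to make the sketch runnable.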
LanguageTesting: a pipeline for analyzing students’ exam performance, taking into account item facility, item discrimination, the B-index,…
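Two of the item statistics named above can be sketched in a few lines. This is a minimal illustration of the standard definitions, not the pipeline's actual code: item facility is the proportion of correct answers, and a simple discrimination index compares the top and bottom fraction of test-takers ranked by total score (the 27% cutoff is a common convention, assumed here).

```python
# Hedged sketch of two classical item-analysis statistics for one
# dichotomously scored (0/1) exam item.

def item_facility(responses):
    """Proportion of test-takers who answered the item correctly."""
    return sum(responses) / len(responses)

def item_discrimination(item_scores, total_scores, frac=0.27):
    """Facility among the top `frac` of test-takers (by total exam
    score) minus facility among the bottom `frac`."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, int(len(order) * frac))
    lower, upper = order[:k], order[-k:]
    facility = lambda group: sum(item_scores[i] for i in group) / len(group)
    return facility(upper) - facility(lower)

item = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]      # toy 0/1 responses to one item
totals = [90, 85, 40, 80, 35, 30, 75, 45, 88, 70]  # toy total exam scores
print(item_facility(item))
print(item_discrimination(item, totals))
```

A well-functioning item yields a positive discrimination index: stronger students get it right more often than weaker ones.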
RL Logistics Simulator: we implemented a logistics simulator that handles express packages efficiently. Pricing and transportation strategies were optimized with unsupervised reinforcement learning, with extreme scenarios simulated repeatedly before convergence.
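The pricing side of such a setup can be sketched with plain tabular Q-learning, swapped in here as a stand-in for the project's actual method. Everything below is hypothetical: the three discrete price levels, the toy demand curve, and the single-state (bandit-style) Q-table exist only to make the idea concrete.

```python
import random

# Hedged toy sketch: an agent picks one of three price levels; demand
# falls as price rises, so the agent should learn the price that
# maximizes expected revenue = price * demand.

random.seed(0)
PRICES = [1.0, 2.0, 3.0]  # hypothetical discrete price actions

def simulate_demand(price):
    """Toy noisy demand curve: higher price -> fewer packages."""
    return max(0.0, 4.0 - price + random.uniform(-0.5, 0.5))

q = [0.0] * len(PRICES)   # single-state Q-table
alpha, epsilon = 0.1, 0.2
for _ in range(5000):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        a = random.randrange(len(PRICES))
    else:
        a = max(range(len(PRICES)), key=q.__getitem__)
    reward = PRICES[a] * simulate_demand(PRICES[a])  # revenue this step
    q[a] += alpha * (reward - q[a])                  # incremental Q update

best_price = PRICES[max(range(len(PRICES)), key=q.__getitem__)]
print(best_price)
```

With expected revenues of 3, 4, and 3 for the three prices, the learned Q-values should single out the middle price; a real simulator would replace the one-line demand curve with full package-flow dynamics and a transportation state.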
Fun
(brewing)