Masato Mita, Ph.D.
R&D Engineer @ Data Technology Lab, Recruit Co., Ltd.
Visiting Researcher @ Social Computational Linguistics Group, Hitotsubashi University
Visiting Researcher @ Natural Language Understanding Team, RIKEN AIP

- I gave a talk titled “Rethinking the Learning Process of Language Models from a Psycholinguistic Perspective” at the Nagoya Area NLP Seminar.
- Our two papers “Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition” and “Targeted Syntactic Evaluation for Grammatical Error Correction” have been accepted at ACL 2025.
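- My first-author paper “Developmental Properties of Working Memory Shape a Critical Period for Language Acquisition” received the Best Paper Award at the 31st Annual Meeting of the Association for Natural Language Processing (NLP2025).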
- Our paper “AdTEC: A Unified Benchmark for Evaluating Text Quality in Search Engine Advertising” has been accepted to NAACL 2025.
- Our two TACL papers will be presented at EMNLP 2024: one on a new meta-evaluation for GEC (Kobayashi+2024) and another on reducing reliance on shortcut prediction (Honda+2024).
- I presented “CAMERA-Suite: An Evaluation Suite for Ad Text Generation” at CADC2024.
- Our paper “DejaVu: Disambiguation evaluation dataset for English-JApanese machine translation on VisUal information” has been accepted to PACLIC 2024.
- Our paper “Revisiting the Evaluation for Chinese Grammatical Error Correction” has been accepted to the Journal of Advanced Computational Intelligence and Intelligent Informatics.
- Our paper “Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding” has been accepted at TACL.
- Our paper “Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation” has been accepted at ACL 2024.
Research Interests
- Natural Language Processing
- Grammatical Error Correction, Resources and Evaluation
- Computational (Psycho)linguistics
- Language acquisition, Cognitive modeling
Publications (Featured)
Aoimi Koyama, Masato Mita, Su-Youn Yoon, Yasufumi Takama, Mamoru Komachi. “Targeted Syntactic Evaluation for Grammatical Error Correction.” Proceedings of the ACL 2025.
Masato Mita, Ryo Yoshida, Yohei Oseki. “Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition.” Proceedings of the ACL 2025.
Peinan Zhang, Yusuke Sakai, Masato Mita, Hiroki Ouchi, Taro Watanabe. “AdTEC: A Unified Benchmark for Evaluating Text Quality in Search Engine Advertising.” Proceedings of the NAACL 2025.
Ukyo Honda, Tatsushi Oka, Peinan Zhang, Masato Mita. “Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding.” Transactions of the Association for Computational Linguistics (TACL).
Masato Mita, Soichiro Murakami, Akihiko Kato, Peinan Zhang. “Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation.” Proceedings of the ACL 2024.
Masamune Kobayashi, Masato Mita, Mamoru Komachi. “Revisiting Meta-evaluation for Grammatical Error Correction.” Transactions of the Association for Computational Linguistics (TACL).
Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui. “Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond.” Proceedings of the BEA 2024.
Masato Mita, Hitomi Yanaka. “Do Grammatical Error Correction Models Realize Grammatical Generalization?” Findings of the ACL-IJCNLP 2021.
Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui. “A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction.” Findings of the EMNLP 2020.
Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui. “Can Encoder-decoder Models Benefit from Pre-trained Language Representation in Grammatical Error Correction?” Proceedings of the ACL 2020.
Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, Kentaro Inui. “An Empirical Study of Incorporating Pseudo Data to Grammatical Error Correction.” Proceedings of EMNLP-IJCNLP 2019.
Masato Mita, Tomoya Mizumoto, Masahiro Kaneko, Ryo Nagata, Kentaro Inui. “Cross-Corpora Evaluation and Analysis of Grammatical Error Correction Models — Is Single-Corpus Evaluation Enough?” Proceedings of the NAACL-HLT 2019.
[...] Please refer to the full list of Publications
MISC
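Best Paper Award, The 31st Annual Meeting of the Association for Natural Language Processing (March 2025) (1 of 765)
Committee Special Award, The 31st Annual Meeting of the Association for Natural Language Processing (March 2025) (32 of 726)
Committee Special Award, The 31st Annual Meeting of the Association for Natural Language Processing (March 2025) (32 of 726)
Outstanding Paper Award, The 30th Annual Meeting of the Association for Natural Language Processing (March 2024) (12 of 599)
Outstanding Research Award, The 258th IPSJ SIG Natural Language Processing Meeting (December 2023)
Encouragement Award, The 18th Symposium of the Young Researcher Association for NLP Studies (YANS) (August 2023)
Committee Special Award, The 29th Annual Meeting of the Association for Natural Language Processing (March 2023) (26 of 579)
Outstanding Paper Award, The 28th Annual Meeting of the Association for Natural Language Processing (March 2022) (7 of 386)
Young Researcher Encouragement Award, The 26th Annual Meeting of the Association for Natural Language Processing (March 2020) (13 of 269)
Outstanding Paper Award, The 26th Annual Meeting of the Association for Natural Language Processing (March 2020) (6 of 396)
Encouragement Award, The 14th Symposium of the Young Researcher Association for NLP Studies (YANS) (August 2019)
Emerging Research Award, The 14th Symposium of the Young Researcher Association for NLP Studies (YANS) (August 2019)
Encouragement Award (Hackathon, Open Division), The 9th Symposium of the Young Researcher Association for NLP Studies (YANS) (September 2014)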
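“Rethinking the Learning Process of Language Models from a Psycholinguistic Perspective.” Nagoya Area NLP Seminar. (2025.6)
“NLP2025 Workshop: The Present and Future of Language Evaluation in the LLM Era.” Journal of Natural Language Processing, Vol. 32, No. 2. (2025.6)
“Aiming to Balance Science and Business Contribution.” Journal of Natural Language Processing, Vol. 31, No. 4. (2024.12)
“Report on the ACL 2024 International Conference.” The 26th Spoken Language Symposium (SP/SLP) and the 11th Natural Language Processing Symposium (NL/NLC). (2024.12)
“CAMERA-Suite: An Evaluation Suite for Ad Text Generation.” CyberAgent Developer Conference (CADC2024). (2024.10) [Video] [Slides]
“[Accepted Paper Introduction] Defining the Ad Text Generation Task and Building a Benchmark (ACL2024).” CyberAgent Blog. (2024.10)
“NLP2024 Theme Session: Evaluation of Language by Humans and Computers.” Journal of Natural Language Processing, Vol. 31, No. 2. (2024.6)
“NLP2023 Theme Session: Language Evaluation and Quality Estimation.” Journal of Natural Language Processing, Vol. 30, No. 2. (2023.6)
“NLP2023 Participation Report.” NLP2023 Participation Report Meeting presented by Money Forward Lab
Nihon Keizai Shimbun (Nikkei), “After All, Something Is Wrong with Japanese Education, Part 4.” (2022.11.4)
“Working Like an Academic While at a Company.” CyberAgent Blog. (2022.7)
“Grammatical Error Correction for Writing Support.” Invited talk, NTT DOCOMO, INC. (2022.2)
“Do Grammatical Error Correction Models Realize Grammatical Generalization?” Journal of Natural Language Processing, Vol. 28, No. 4. (2021.12)
“Evaluating the Grammaticality of Grammatical Error Correction Models and Proposing an Essay Rewriting Task.” The 19th NLP Tokyo D Meeting (NLP東京Dの会). (2021.3)
“Current Status and Future Prospects of Grammatical Error Correction Technology for Writing Learning Support.” Symposium on Educational Assessment × Language Processing: Possibilities of Automated Scoring, English Writing Correction, and Essay Evaluation. (2020.12)
“Raising Issues on the Evaluation of Grammatical Error Correction.” The 17th NLP Tokyo D Meeting (NLP東京Dの会). (2019.6)
“What Are Large Language Models?” Gendai Kagaku (現代化学), September 2023 issue.
“Theory and Practice of Natural Language Processing with Deep Learning.” Coloso.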
- AdPsyche (NLP2025)
- Japanese preference dataset based on advertising psychology
- LCTG Bench (NLP2024)
- Benchmark to measure the controllability of Japanese LLMs, in terms of how well they comply with constraints in instructions such as character counts and keywords
- CAMERA3 (LREC-COLING 2024)
- Evaluation dataset for controllable ad text generation in Japanese
- SEEDA (TACL 2024)
- Sentence-based and edit-based human evaluation dataset for GEC
- CAMERA (ACL 2024)
- Multimodal dataset for ad text generation in Japanese
- TETRA (BEA2024)
- Document revision corpus
- JaLeCoN (BEA 2023)
- Dataset of Japanese lexical complexity for non-native readers
- ClozEx (EMNLP 2023, Findings)
- Dataset for a task of generation of English cloze explanation
- CELA (EACL 2023, Findings)
- Dataset for cloze quality estimation
- ProQE (LREC 2022)
- Proficiency-wise quality estimation dataset
- Evaluation Dataset for the Honorific Conversion Task (JSAI 2022)
- Evaluation dataset for an honorific conversion task, intended to support learners of Japanese
- FLUTEC (NLP 2022)
- Evaluation dataset for Japanese grammatical error correction on fluency edits
- PheMT (COLING 2020)
- Phenomenon-wise dataset designed for evaluating the robustness of Japanese-English machine translation systems
- GitHub Typo Corpus (LREC 2020)
- Large-scale multilingual dataset of misspellings and grammatical errors
- TEA2015 (sub) → Accepted
- EMNLP2015 (main) → Rejected
- ACL2016 (main) → Rejected
- NAACL2019 (main) → Accepted
- BEA2019 (sub) → Accepted
- EMNLP2019 (sub) → Accepted
- LREC2020 (sub) → Accepted
- ACL2020 (main) → Withdrawn
- ACL2020 (sub) ×2 → Accepted ×1, Rejected ×1
- ACL-SRW2020 (sub) → Accepted
- EMNLP2020 (main) → Accepted (Findings)
- COLING2020 (sub) ×2 → Accepted ×2
- ACL2021 (main) → Accepted (Findings)
- JNLP2021 (main) → Accepted
- JNLP2021 (sub) → Accepted
- JNLP2022 (sub) → Accepted
- LREC2022 (sub) ×2 → Accepted ×2
- EMNLP2022 (main) → Accepted (Findings) → Withdrawn
- TALIP2023 (sub) → Accepted
- JNLP2023 (sub) → Accepted
- ACL2023 (main) → Rejected
- BEA2023 (sub) → Accepted
- EACL2023 (sub) → Accepted (Findings)
- EMNLP2023 (sub) → Accepted (Findings)
- AACL2023 (main) → Rejected
- JNLP2024 (sub) → Accepted
- LREC-COLING2024 (sub) ×3 → Accepted ×2, Rejected ×1
- TACL2024 (sub) ×3 → Accepted ×2, Rejected ×1
- BEA2024 (main) → Accepted
- BEA2024 (sub) → Accepted
- ACL2024 (main) → Accepted
- ACL2024 (sub) → Rejected
- ACL-SRW2024 (sub) ×2 → Rejected ×2
- INLG2024 (sub) → Rejected
- COLM2024 (sub) → Rejected
- NeurIPS2024 D&B Track (sub) → Rejected
- JACIII2024 (sub) → Accepted
- PACLIC2024 (sub) → Accepted
- NAACL2025 (sub) → Accepted
- ACL2025 (main) → Accepted
- ACL2025 (sub) ×2 → Accepted ×1, Rejected ×1
Academic Services
Organizer
Current
Past
- GenChal2022:FCG, Organizer (2019-2023)
Reviewer
CL, ACL, EMNLP, NAACL, ARR, LREC, COLING, JNLP, BEA, etc.
Research Grants
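- KAKENHI Grant-in-Aid for Scientific Research (B), “Construction of Evaluation Datasets and Quality Estimation for Language Generation with Deep Learning,” Co-Investigator (Principal Investigator: Mamoru Komachi)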
Work Experience
- 2025.5 - Present Recruit Co., Ltd., R&D Engineer
- 2025.4 - Present Hitotsubashi University, Visiting Researcher
- 2022.7 - Present RIKEN Center for Advanced Intelligence Project (AIP), Visiting Researcher
- 2022.6 - 2025.4 CyberAgent, Inc., Research Scientist
- 2021.10 - 2025.3 Tokyo Metropolitan University, Project Assistant Professor
- 2019.10 - 2020.12 Megagon Labs, Contract Researcher
- 2018.2 - 2022.5 RIKEN Center for Advanced Intelligence Project (AIP), Researcher
- 2016.4 - 2018.1 Microsoft Japan, Engineer
- 2015.10 - 2015.12 Rakuten Institute of Technology - New York (RIT-NY), Part-time Researcher
- 2014.8 - 2014.8 NTT Communication Science Laboratories, Part-time Researcher
Education
- 2024.4 - Present Ph.D. student at The University of Tokyo (Supervisor: Assoc. Prof. Yohei Oseki)
- 2018.10 - 2021.9 Ph.D. in Information Science, Tohoku University (Supervisor: Prof. Kentaro Inui)
- 2014.4 - 2016.3 M.S. in Engineering, Nara Institute of Science and Technology (NAIST) (Supervisor: Prof. Yuji Matsumoto)
- 2010.4 - 2014.3 B.A., Prefectural University of Hiroshima