Yuyang Dong

Me and GF

This picture was taken at Tsukuba hill with my wife (Ph.D. program in Linguistics, University of Tsukuba).
We were tired at the time. (She is cute, right? ^.^)
We met in high school and married in 2016.
E-mail: dongyuyang@nec.com

Profile

Research Field

  • Spatial index and vector search (PhD)
  • LLM/ML/NLP for DB, DB for LLM/ML/NLP
  • LLM/VLM for tabular data

Publications

A full list is also available on DBLP.

   2024

  1. Jellyfish: A Large Language Model for Data Preprocessing [Paper] [HF model]
    Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada
    The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 Main Long)
  2. On the Use of Large Language Models for Table Tasks (Tutorial) [Github page]
    Yuyang Dong, Masafumi Oyamada, Chuan Xiao, Haochen Zhang
    33rd ACM International Conference on Information and Knowledge Management (CIKM 2024)
  3. Large Language Models as Data Preprocessors [Paper]
    Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada
    2nd International Workshop on Tabular Data Analysis, International Conference on Very Large Data Bases. (TaDA workshop@VLDB 2024)

   2023

  1. QA-Matcher: Unsupervised Entity Matching Using A Question Answering Model [Slide]
    Shogo Hayashi, Yuyang Dong, Masafumi Oyamada
    Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2023)
  2. DeepJoin: Joinable Table Discovery with Pre-trained Language Models [Slide]
    Yuyang Dong, Chuan Xiao, Takuma Nozawa, Masafumi Enomoto, Masafumi Oyamada
    International Conference on Very Large Data Bases. (VLDB 2023)
  3. CAGAIN: Column Attention Generative Adversarial Imputation Networks
    Jun Kawagoshi, Yuyang Dong, Takuma Nozawa, Chuan Xiao
    International Conference on Database and Expert Systems Applications (DEXA 2023)

   2022

  1. Table Enrichment System for Machine Learning [Paper] [Demo Youtube]
    Yuyang Dong, Masafumi Oyamada
    Demo paper, International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022)

   2021

  1. Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach [Paper] [Extended Version]
    Yuyang Dong, Kunihiro Takeoka, Chuan Xiao, Masafumi Oyamada
    International Conference on Data Engineering (ICDE 2021)
  2. Quality Control for Hierarchical Classification with Incomplete Annotations
    Masafumi Enomoto, Kunihiro Takeoka, Yuyang Dong, Masafumi Oyamada, Takeshi Okadome
    Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2021)
  3. Entity Matching with String Transformation and Similarity-Based Features
    Kazunori Sakai, Yuyang Dong, Masafumi Oyamada, Kunihiro Takeoka, Takeshi Okadome
    Workshop on Software Foundations for Data Interoperability (SFDI 2021@VLDB 2021 Workshop)

   2020

  1. Learning from Unsure Responses
    Kunihiro Takeoka, Yuyang Dong, Masafumi Oyamada
    AAAI Conference on Artificial Intelligence (AAAI 2020)
  2. NGNC: A Flexible and Efficient Framework for Error-Tolerant Query Autocompletion
    Yukai Miao, Jianbin Qin, Sheng Hu, Yuyang Dong, Yoshiharu Ishikawa, Makoto Onizuka
    Workshop on Software Foundations for Data Interoperability (SFDI 2020@VLDB 2020 Workshop)
  3. Continuous Top-k Spatial-Keyword Search on Dynamic Objects [Paper]
    Yuyang Dong, Chuan Xiao, Hanxiong Chen, Jeffrey Xu Yu, Kunihiro Takeoka, Masafumi Oyamada, Hiroyuki Kitagawa
    The VLDB Journal, Springer. (VLDBJ)

   2019

  1. Continuous Search on Dynamic Spatial Keyword Objects [Paper]
    Yuyang Dong, Hanxiong Chen, Hiroyuki Kitagawa
    Short paper, International Conference on Data Engineering (ICDE 2019)
  2. Balanced Nearest Neighborhood Query in Spatial Database
    Sang Le, Yuyang Dong, Hanxiong Chen, Kazutaka Furuse
    Short paper. International Conference on Big Data and Smart Computing (BigComp 2019)

   2018

  1. Weighted Aggregate Reverse Rank Queries [Paper]
    Yuyang Dong, Hanxiong Chen, Jeffrey Xu Yu, Kazutaka Furuse, Hiroyuki Kitagawa
    ACM Transactions on Spatial Algorithms and Systems (TSAS)
  2. Bound-and-filter Framework for Aggregate Reverse Rank Queries
    Yuyang Dong, Hanxiong Chen, Kazutaka Furuse, Hiroyuki Kitagawa
    Transactions on Large-Scale Data and Knowledge-Centered Systems (TLDKS)
  3. Efficient Methods for Aggregate Reverse Rank Queries
    Yuyang Dong, Hanxiong Chen, Kazutaka Furuse, Hiroyuki Kitagawa
    IEICE Transactions on Information and Systems.

   2017

  1. Grid-Index Algorithm for Reverse Rank Queries [Paper]
    Yuyang Dong, Hanxiong Chen, Jeffrey Xu Yu, Kazutaka Furuse, Hiroyuki Kitagawa
    International Conference on Extending Database Technology (EDBT 2017)
  2. Efficient Processing of Aggregate Reverse Rank Queries
    Yuyang Dong, Hanxiong Chen, Hiroyuki Kitagawa
    Short paper. International Conference on Database and Expert Systems Applications (DEXA 2017)

   2016

  1. Aggregate Reverse Rank Queries [Paper]
    Yuyang Dong, Hanxiong Chen, Kazutaka Furuse, Hiroyuki Kitagawa
    International Conference on Database and Expert Systems Applications (DEXA 2016) (Best Paper Award).

Activity & Award

  1. Internship, Cloud & Solution Group Company, TOSHIBA JAPAN Co., Ltd. 2014.8, Two weeks.
  2. Internship, Smart Center, NTT DATA Co., Ltd. 2014.10, Two weeks.
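  3. University of Tsukuba President's Award, 2019
  4. Class representative, Doctoral Program, Graduate School of Systems and Information Engineering, University of Tsukuba, 2019
  5. DBSJ Kambayashi Young Researcher Award, The Database Society of Japan (DBSJ), 2022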
  6. Reviewer (or external reviewer): TKDE Journal, KAIS Journal, IEICE Journal, IEEE Access Journal, IPSJ Journal,
    ICMR'18, ICML'19,20, ICSC'19, NeurIPS'19,20, AAAI'20,21, DASFAA'20
  7. PC or OC: DASFAA'20, DEIM'20, MIPR'21