Biography

  • Chair Professor in Department of Computer Science at City University of Hong Kong
  • Research interests include cloud and distributed computing systems, computer networks and communication, and secure data sharing and privacy-preserving computing for AI and Machine Learning
  • SRFS project: to develop new technologies to ensure data privacy and security in federated learning. Because federated learning trains the model in a distributed fashion, security problems can arise. This research will develop secure proof-of-data and proof-of-training protocols for federated learning platforms to ensure that the training output is indeed generated by the prescribed algorithm over the certified data
  • Awards and Honours:
    • RGC Senior Research Fellow (2024)
    • IEEE TCDP (Technical Committee on Distributed Processing) Outstanding Service and Contributions Award (2024)
    • ACM Distinguished Member (2018)

Project Title

  • Robust Aggregation, Proof-of-data and Proof-of-training in Federated Learning

Award Citation 

Professor Jia is a distinguished research scientist. His research interests include cloud and distributed computing systems, computer networks and communication, and secure data sharing and privacy-preserving computing for AI and machine learning. Over an academic career of more than 30 years, he has published over 450 research articles in top-tier journals and conferences in his fields and graduated over 30 PhD students. He has received many best paper awards from competitive conferences, as well as the IEEE TCDP (Technical Committee on Distributed Processing) Outstanding Service and Contributions Award in 2024. He is an IEEE Fellow and an ACM Distinguished Member.


The SRFS project led by Professor Jia aims to develop new technologies to ensure data privacy and security in federated learning. Federated learning is a distributed ML method in which training happens at the local sites where the data resides. Because the data never leaves the clients' sites, federated learning is regarded as a method that can best protect data privacy and security. However, because training takes place at the local sites, federated learning suffers from several major problems, including free-riding, where clients claim to have training data they do not possess; lazy training, where clients do not fully carry out the required training computation; and poisoning attacks on the global model by malicious clients. To address these security concerns, the team will develop proof-of-data and proof-of-training protocols to ensure that the training output from each client is indeed generated from the certified data and by the prescribed training algorithm.
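
As an illustration of the kind of robust aggregation referred to in the project title, the following is a minimal sketch (not the team's actual protocol); the function names, client counts, and update values are all illustrative assumptions. It shows how a coordinate-wise median, a common robust aggregation rule, dampens a poisoned client update that would otherwise pull plain federated averaging off course.

```python
import numpy as np

def fed_avg(updates):
    """Plain federated averaging: mean of the client model updates."""
    return np.mean(updates, axis=0)

def robust_median(updates):
    """Coordinate-wise median: a simple robust aggregation rule."""
    return np.median(updates, axis=0)

# Illustrative scenario: 9 honest clients send similar updates,
# while 1 malicious client sends a heavily scaled (poisoned) update.
rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(9, 4))   # honest updates near 1.0
poisoned = np.full((1, 4), 50.0)                        # attacker's outlier update
updates = np.vstack([honest, poisoned])

print("mean aggregation:  ", fed_avg(updates))        # pulled toward the attacker
print("median aggregation:", robust_median(updates))  # stays close to honest values
```

Robust aggregation of this kind limits the damage a malicious client can do to the global model, while the proof-of-data and proof-of-training protocols described above target the complementary problems of free-riding and lazy training.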


If successful, this project will open a new frontier for large-scale distributed ML, in which data scattered around the world can be fully utilized by participating in training at its local sites. This will truly unleash the power of data and raise the performance of AI models to new heights.

Short video of awardee