Kashmiri student Hanan Gani has achieved a significant milestone in the field of artificial intelligence, successfully researching and publishing papers at top global conferences. The accomplishment is the result of his work toward a master’s degree in machine learning at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in Abu Dhabi.
Gani, who studied Electronics and Communications Engineering at the National Institute of Technology (NIT) Srinagar, is a resident of north Kashmir’s Baramulla. He previously worked at Samsung India as a machine learning engineer.
“It was a great experience working for such a large company, but I was mainly working on existing projects, and I was keen to work on something completely new. This is what led me to study for a master’s in machine learning at MBZUAI,” Gani said. “I decided to take the plunge and leave my job for academia and pursue my dream of researching and publishing papers at top conferences.”
He was part of a cohort of 101 students from 22 countries who graduated on June 6 in the Class of 2024 at MBZUAI, the world’s first university specializing in AI research. The UAE was the most represented country, with 24 students, and India was among the top countries represented, along with Egypt.
The ceremony was attended by His Highness Sheikh Khaled bin Mohamed bin Zayed Al Nahyan, Crown Prince of Abu Dhabi and Chairman of the Abu Dhabi Executive Council, who congratulated the three Ph.D. and 98 master’s degree recipients.
Gani wasted no time tapping into MBZUAI’s wide reserves of expertise and immersed himself in AI challenges including enabling algorithms to learn from limited data and getting multimodal models to achieve better results by learning from images and text.
“This area of research is important because machines and many types of portable devices should be able to understand images and language,” Gani said. “In addition to improving the capabilities of robots and autonomous devices to learn and perform useful tasks, it also improves the efficiency with which they can learn.”
Gani wanted to explore this field because there remains plenty of work to be done, and it aligned well with his research interests. “I looked for gaps in the research and areas where the models are not performing especially well,” he explained. “I observed that models don’t work well on long text prompts, so I combined the two methods – large language models and diffusion models. I felt inspired to work on some of the challenges that affect the services that tech giants provide.”
Text-to-image (T2I) generators caused a buzz when they rose to prominence a couple of years ago with the launch of tools such as DALL-E, Midjourney and Adobe Firefly. However, users quickly found that anything more than a simple prompt of a few words would confuse the system and lead to images that either looked strange or failed to fulfil the request. A solution to this problem has continued to elude the industry.
Gani, who was advised by Dr. Salman Khan, associate professor of computer vision at MBZUAI, has published three papers at conferences including the International Conference on Learning Representations (ICLR), the British Machine Vision Conference (BMVC), and the Conference on Neural Information Processing Systems (NeurIPS). One further work is under review at the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI).
The ICLR paper has already been cited more than 20 times, underscoring the potential impact of the research. However, Gani admits that the project was not always plain sailing, and there were times when he doubted whether it was going in the right direction.
“My method aims to enhance accuracy by producing images that accurately fit the complex and lengthy text,” he said. “We improved upon the existing main technique and were able to generate images from the text that exactly follow the details of the text, and to the best of our knowledge, we are the first in the machine learning community to have done so.”
Gani is also optimistic that his research work on efficient learning has the potential to do good by helping bring the benefits of machine learning to regions that lack robust energy and data networks. “My methods are mostly label efficient and work in low resource environments where there is a lack of data,” he added. “It could have multiple uses such as bringing AI to scanning devices in hospitals and helping autonomous vehicles to recognize objects and signs.”
Following his graduation, Gani is keen to continue this line of research and undertake a Ph.D. at MBZUAI, which is recognized as one of the world’s top 100 computer science universities and is ranked in the top 20 for its specializations in AI, computer vision, machine learning, natural language processing, and robotics, according to CSRankings.
Gani attributes his positive experience at MBZUAI to the quality of teaching and mentoring. “My supervisor, Dr. Salman Khan, gave me a lot of freedom to research the areas I found most compelling and was always very supportive and ready to share insights and advice. I’ve been very fortunate to work with him,” Gani said.
“My mentor, Dr. Muzammal Naseer, a research scientist at MBZUAI, was also generous with his knowledge and experience, and helped guide my research. My secondary supervisor, Professor Fahad Khan, deputy department chair of computer vision and professor of computer vision at MBZUAI, also extended his support whenever I needed.”
This year’s class joins the university’s growing alumni network of 111 AI leaders who are shaping the evolution of technology and AI across multiple sectors. MBZUAI is crucial in positioning Abu Dhabi and the UAE as an international hub for AI excellence – the Silicon Valley of this region.
MBZUAI offers Ph.D. and master’s degrees in five AI specializations: computer science, computer vision, machine learning, natural language processing, and robotics. To apply, visit www.mbzuai.ac.ae.