Image segmentation is often under-constrained when it relies solely on visual features such as colour, texture, boundaries and objects. Expert analysis typically draws on extra information beyond the underlying image features: a doctor examining an X-ray, for example, will interpret it differently than a patient or a person without a medical background. There is therefore a need to associate another modality, such as application-specific text, to complement the visual features and design a hybrid semantic system. Moreover, medical images often carry errors from noise, signal artifacts, compression, numerical representation, and non-steady focus or luminance, as in capsule endoscopy. Consequently, numerical feature vectors generated for a machine learning model from visual attributes alone will be inaccurate and will miss contextual information. This missing information can be supplied by linguistic cues associated with an image. Thus, visual features and textual information obtained from expert resources should be fused together to evolve the next generation of multimodal information retrieval and image segmentation.
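The fusion idea above can be illustrated with a minimal sketch: extract a toy visual descriptor from an image, a toy bag-of-words vector from an associated text report, and concatenate the two into a single multimodal feature vector. The feature extractors here (`visual_features`, `text_features`) and the vocabulary are hypothetical stand-ins, not part of any specific method described in this work.

```python
import numpy as np

def visual_features(image):
    # Toy visual descriptor: per-channel mean and standard deviation
    # (a hypothetical stand-in for colour/texture features).
    img = np.asarray(image, dtype=float)
    return np.concatenate([img.mean(axis=(0, 1)), img.std(axis=(0, 1))])

def text_features(report, vocabulary):
    # Toy bag-of-words count vector over a fixed vocabulary
    # (a hypothetical stand-in for richer linguistic cues).
    tokens = report.lower().split()
    return np.array([tokens.count(word) for word in vocabulary], dtype=float)

def fuse(visual_vec, text_vec):
    # Early fusion: simple concatenation of the two modality vectors,
    # yielding one hybrid feature vector for a downstream model.
    return np.concatenate([visual_vec, text_vec])

# Example: an 8x8 RGB array stands in for an endoscopy frame.
image = np.random.rand(8, 8, 3)
vocab = ["lesion", "bleeding", "polyp"]
report = "possible bleeding observed near the polyp"
fused = fuse(visual_features(image), text_features(report, vocab))
# fused combines 6 visual dimensions (mean + std per channel)
# with 3 textual dimensions (word counts).
```

Concatenation is only the simplest fusion strategy; in practice the modalities may be combined with learned weights or attention, but the sketch shows how textual cues extend a purely visual representation.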
Husanbir is an Assistant Professor in the Computer Science and Engineering Department, Thapar Institute, Patiala, India. His research interests are Machine Learning, Image Processing, Data Analysis, Multimodal Learning Systems and Numerical Optimization. He was a postdoctoral research fellow at Trinity College Dublin, Ireland (2019). He received his PhD from the University of North Texas, USA (2012) and his Master's degree from California State University, East Bay, USA (2006). He has published over 40 research papers, including 18 in SCIE journals and 6 in conferences held outside India. He is also an active peer reviewer for many internationally reputed journals.