Generate dialogue from English context
Generate Chinese dialogue from context
Extract entities and their types from Chinese questions
Segment objects in images using point or text prompts