Abstract: Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers (ViTs) being the primary choice due to their good scalability ...
In recent years, the field of unmanned aerial vehicle (UAV) tracking has grown rapidly, finding numerous applications across various industries. While the discriminative correlation filters ...