The data science revolution is finally enabling the development of large-scale, data-driven models that provide real-time or near-real-time forecasts and risk analysis for infectious disease threats. These models also provide rationales and quantitative analysis to support policy-making decisions and intervention plans. At the same time, the non-incremental advance of the field presents a broad range of challenges, both algorithmic (multiscale constitutive equations, scalability, parallelization) and in the real-time integration of novel digital data streams (social networks, participatory platforms, human mobility, etc.). I will review and discuss recent results and challenges in the area, and focus on ongoing work aimed at responding to the COVID-19 pandemic.