Deployed next-generation architectures and systems are characterized by high concurrency, low memory per core, and multilevels of hierarchy and heterogeneity. These characteristics bring out new challenges in energy efficiency, fault-tolerance, and scalability. Next-generation programming models and their associated middleware and runtimes have a responsibility to tackle these challenges. This talk focuses on challenges and opportunities in designing efficient runtimes using a formula (MPI+X) to accelerate applications for emerging high-performance computing (HPC) systems with millions of processors and featuring next-generation interconnects. Energy-aware designs and codesign schemes for such environments are also emphasized. View features and sample performance numbers from the MVAPICH2 libraries.