General purpose computing architectures are evolving quickly to become many-core and hierarchical: i.e. a core can communicate more quickly locally than globally. To be effective on such architectures programming models must be aware of the communication hierarchy, and yet preserve performance portability.We propose four new architecture-aware constructs for the parallel Haskell extension GpH that exploit information about task size and aim to reduce communication for small tasks, preserve data locality, or to distribute large units of work. We report a preliminary investigation of architecture-aware programming models that abstract over the new constructs. In particular we propose architecture-aware evaluation strategies and skeletons. We investigate three common parallel paradigms on hierarchical architectures with up to 224 cores. The results show that the architecture–aware constructs consistently deliver better speedup and scalability than existing constructs together with dramatically reduced variability. In some experiments speedup is improved by an order of magnitude.