An Australian man quietly whispers as he describes each pub (hotel, public house) shown in the pictures on screen, in the style of autonomous sensory meridian response (ASMR). The pubs are housed in great old buildings, and it is the buildings themselves, rather than what happens inside, that make this video interesting. The pubs featured are in both the Sydney CBD and Parramatta.