Funnily enough, Plan 9 uses concurrency (via CSP) as a way of structuring programs to make them easier to write and reason about.
Concurrency isn't hard; sharing is hard. That's why mutable state hurts understandability: things can change out from under you, because the view of the state is shared. That's why global variables are bad: they are shared between different parts of the program, and can be modified from places you can't see. And that's why shared-memory concurrency is hard. Add a dash of race conditions (operations that involve a separate load and a separate store) and things get hard indeed.
Of course, I'm oversimplifying as I write this, but I think that's really the gist of it.
This is why I generally prefer event-driven programming over multithreaded programming. It's the sort of programming you do, for example, when implementing a GUI: you write short event-handler functions that respond to events (such as mouse clicks or the completion of network or disk reads), and a master loop receives events and dispatches them to the handler functions.
You get the same benefits as concurrency (maximum processor usage, at least when you only have one processor), but you avoid the pitfalls of shared memory, race conditions, and unexpected state changes, because only one event handler runs at a time. It does take some getting used to: it's all too easy to call functions like time() or mkdir() that actually block, and that technically ought to be split out behind a new callback. But I find the processor efficiency, plus the comfort of avoiding concurrency problems, worthwhile.
The point I was trying to make is that a well-designed concurrent program is often easier to reason about than an event-driven one. Even without the benefits of scaling to multiple processors and hiding latency, concurrent programming would still have the advantage of clearly structuring programs.
I think I agree with most of that. I'm definitely a big fan of CSP, and some of its ideological successors. To this day I think that Occam is one of the most ingeniously elegant languages ever invented.
I also think that drawing a distinction between concurrency and sharing is rather artificial. Embarrassingly parallel problems aside (where the concurrency is really better characterised as co-habitation of a computing resource), it's still pretty easy to get both shared memory and channel-based message passing wrong in concurrent situations. I agree that the presence of shared memory makes it worse, and my experiences using things like CML without mutable storage have been much more positive than with, say, Java's concurrency primitives. But I think it's still "hard" in that it requires considerably more intellectual effort than writing non-concurrent programs of similar scope and size.
Some cases are trivial (MapReduce is pretty trivial), but sharing is an integral part of most practically applicable work. In other words, the majority of real-world applications of concurrency are hard.
I think that languages like Erlang, and many of the functional languages, show that you don't actually need shared ownership and shared mutation for a large proportion of the problems out there. (Shared constant values, of course, are not a problem.)
Linear typing can also help by enforcing ownership transfers between workers operating on data, allowing efficient data transfer without sharing.