I would second the proposal of having a dedicated 'libcouchbase thread' that handles commands and communicates with the rest of your application through some kind of queue. The problem you are having seems to be socket buffer synchronization.
Specifically, whenever you schedule an lcb command (e.g. "lcb_store"), it writes a packet to our own socket buffer. Because libcouchbase is asynchronous, it will flush that buffer to the network at some later, unspecified time (thereby modifying the socket buffer).
Therefore, even if you lock your handle with something like pthread_mutex_lock, you will still encounter issues unless you can also control when the socket buffer is actually flushed (i.e. when 'lcb_wait' is called).
As such, you'd probably need a libcouchbase "server" thread.
One possible implementation is to have this thread run an event framework (e.g. libev) which drives libcouchbase. In addition to running libcouchbase asynchronously, the thread also accepts command proxies over a queue (i.e. proxied requests for "lcb_store", "lcb_get", etc.).
This thread then schedules the command to libcouchbase within its own thread context. It knows when there are commands to schedule by checking whether there are items in the queue. This can be done in a non-blocking fashion, for example by creating a 'socketpair' to which a byte of dummy data is written each time something is added to the queue; the event loop then sees the other end become readable and wakes the thread.
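To make the queue-plus-socketpair idea concrete, here is a minimal sketch. It assumes a POSIX system and C++11. The class name CommandQueue and its methods are hypothetical and not part of libcouchbase. Each command is an opaque std::function, which in a real program would wrap a call such as "lcb_store" on the handle owned by the libcouchbase thread.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <functional>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <sys/socket.h>
#include <unistd.h>
#include <utility>

// Hypothetical helper, not a libcouchbase API: a queue of proxied commands
// plus a socketpair used only to wake up the thread that owns the handle.
class CommandQueue {
public:
    using Command = std::function<void()>;

    CommandQueue() {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds_) != 0)
            throw std::runtime_error("socketpair failed");
        // Non-blocking on both ends so neither push() nor drain() can stall.
        for (int fd : fds_)
            fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    }
    ~CommandQueue() { close(fds_[0]); close(fds_[1]); }
    CommandQueue(const CommandQueue&) = delete;
    CommandQueue& operator=(const CommandQueue&) = delete;

    // Called from any application thread.
    void push(Command cmd) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(cmd));
        }
        char byte = 0;
        // If this fails because the socket buffer is full, wakeup bytes are
        // already pending, so the libcouchbase thread will wake up anyway.
        ssize_t rc = write(fds_[1], &byte, 1);
        (void)rc;
    }

    // Register this descriptor for "readable" events in the event loop.
    int wakeup_fd() const { return fds_[0]; }

    // Called only on the libcouchbase thread, when wakeup_fd() is readable.
    // Runs every queued command and returns how many were run.
    std::size_t drain() {
        char buf[64];
        while (read(fds_[0], buf, sizeof buf) > 0) {
            // Discard the dummy wakeup bytes.
        }
        std::queue<Command> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(queue_);
        }
        const std::size_t count = pending.size();
        while (!pending.empty()) {
            pending.front()();  // e.g. a lambda that schedules lcb_store
            pending.pop();
        }
        return count;
    }

private:
    int fds_[2];
    std::mutex mutex_;
    std::queue<Command> queue_;
};
```

The mutex here only protects the queue itself. The libcouchbase handle is never touched outside drain(), so it needs no lock of its own. A push that races with drain() can at worst cause one spurious wakeup with an empty queue, which is harmless.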
Subsequently, the "libcouchbase thread" will receive the callbacks for those commands. You can proxy these callbacks out to your own user-defined callbacks.
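One way the callback proxying could look is a small registry that maps a token to the user's callback. This is only a sketch. CallbackRegistry, the token scheme, and the (status, value) callback signature are all assumptions made for illustration, not libcouchbase types. The token would have to travel with the request, for instance through the per-command cookie pointer, so that the libcouchbase callback can find the matching user callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper, not a libcouchbase API: remembers which user callback
// belongs to which in-flight request.
class CallbackRegistry {
public:
    // Assumed signature for illustration; adapt to whatever your callers need.
    using UserCallback = std::function<void(int status, const std::string& value)>;

    // Called by the application thread before it queues a command.
    // Returns a token identifying the request.
    std::uint64_t add(UserCallback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::uint64_t token = next_token_++;
        callbacks_.emplace(token, std::move(cb));
        return token;
    }

    // Called on the libcouchbase thread from inside the library callback.
    // Returns false if the token is unknown or was already dispatched.
    bool dispatch(std::uint64_t token, int status, const std::string& value) {
        UserCallback cb;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = callbacks_.find(token);
            if (it == callbacks_.end())
                return false;
            cb = std::move(it->second);
            callbacks_.erase(it);
        }
        cb(status, value);  // invoked outside the lock to avoid deadlocks
        return true;
    }

private:
    std::mutex mutex_;
    std::uint64_t next_token_ = 1;
    std::unordered_map<std::uint64_t, UserCallback> callbacks_;
};
```

Note that in this sketch the user callback runs on the libcouchbase thread. If it must run on the caller's thread instead, you would push the result onto a second queue going the other way.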
While this sounds complicated, I am guessing the whole thing shouldn't take more than 300 lines of code, assuming you already have a "Queue" implemented (and since this is C++, there are several standard containers that fit this model).