PCD IO consistency on NFS - msync needed?

kmatzen
I use pcl often on our cluster and the results get saved to NFS.  However, in many cases, depending on the server load, the pcd files are not written back properly.  I was wondering, don't you have to call msync before munmap to ensure the mapped memory is properly written back to disk?

Here are the changes that I think are necessary, but I haven't considered what might happen on Windows at all.  It's hard for me to reproduce the failure when these calls to msync aren't included, but what I can say is that I have not observed the issue once they are in place.

--- io/include/pcl/io/impl/pcd_io.hpp (revision 2665)
+++ io/include/pcl/io/impl/pcd_io.hpp (working copy)
@@ -216,6 +216,8 @@
     }
   }
 
+  msync(map, data_idx + data_size, MS_SYNC);
+
   // Unmap the pages of memory
 #if _WIN32
     UnmapViewOfFile (map);
@@ -387,6 +389,8 @@
   // Copy the compressed data
   memcpy (&map[data_idx], temp_buf, data_size);
 
+  msync(map, compressed_final_size, MS_SYNC);
+
   // Unmap the pages of memory
 #if _WIN32
     UnmapViewOfFile (map);
@@ -668,6 +672,8 @@
     }
   }
 
+  msync(map, data_idx + data_size, MS_SYNC);
+
   // Unmap the pages of memory
 #if _WIN32
     UnmapViewOfFile (map);

Re: PCD IO consistency on NFS - msync needed?

Jochen Sprickerhof
Administrator
Hi kmatzen,

* kmatzen <[hidden email]> [2011-10-09 11:40]:
> I use pcl often on our cluster and the results get saved to NFS.  However, in
> many cases, depending on the server load, the pcd files are not written back
> properly.  I was wondering, don't you have to call msync before munmap to
> ensure the mapped memory is properly written back to disk?

according to the man page of mmap:
 The file may not actually be updated until msync(2) or munmap() is
 called.
see this discussion as well:
http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-05/2398.html

So I think we should be fine without msync.

> Here are the changes that I think are necessary, but I haven't considered what
> might happen on Windows at all.  It's hard for me to reproduce the failure
> when these calls to msync aren't included, but what I can say is that I have
> not observed the issue once they are in place.

Could you try a simple program that just does the mmap to see if this is
not a problem with the NFS?
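
A minimal sketch of such a test program, assuming a POSIX system. The helper name mmapRoundTrip and the byte pattern are hypothetical and not part of PCL. It writes a known pattern through a shared mapping, optionally calls msync() before munmap(), and then re-reads the file with read() to count differing bytes:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical test helper (not part of PCL).
// Returns the number of bytes that differ after the round trip, or -1 on error.
long
mmapRoundTrip (const std::string &path, std::size_t size, bool use_msync)
{
  int fd = open (path.c_str (), O_RDWR | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return (-1);
  // Grow the file to its final size before mapping it.
  if (ftruncate (fd, static_cast<off_t> (size)) != 0)
  {
    close (fd);
    return (-1);
  }
  void *addr = mmap (nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (addr == MAP_FAILED)
  {
    close (fd);
    return (-1);
  }
  // Fill the mapping with a pattern that is easy to verify later.
  char *map = static_cast<char*> (addr);
  for (std::size_t i = 0; i < size; ++i)
    map[i] = static_cast<char> (i % 251);

  bool ok = true;
  if (use_msync && msync (addr, size, MS_SYNC) != 0)
    ok = false;
  if (munmap (addr, size) != 0)
    ok = false;
  if (close (fd) != 0)
    ok = false;
  if (!ok)
    return (-1);

  // Re-read the file through the ordinary read() path.
  fd = open (path.c_str (), O_RDONLY);
  if (fd < 0)
    return (-1);
  std::vector<char> buf (size);
  std::size_t got = 0;
  while (got < size)
  {
    ssize_t n = read (fd, buf.data () + got, size - got);
    if (n <= 0)
      break;
    got += static_cast<std::size_t> (n);
  }
  close (fd);

  // Bytes that could not be read at all count as mismatches too.
  long mismatches = static_cast<long> (size - got);
  for (std::size_t i = 0; i < got; ++i)
    if (buf[i] != static_cast<char> (i % 251))
      ++mismatches;
  return (mismatches);
}
```

On a local file system, or when re-reading on the same client, the read is likely served from the page cache and should report 0 either way. To say anything about NFS, the verification half would probably have to run on a different client than the one that wrote the file.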

Cheers,

Jochen
_______________________________________________
PCL-developers mailing list
[hidden email]
http://pointclouds.org/mailman/listinfo/pcl-developers
http://pointclouds.org

Re: PCD IO consistency on NFS - msync needed?

kmatzen
http://nfs.sourceforge.net/

"This document provides an introduction to NFS as implemented in the Linux kernel."

"D8. When my application uses memory-mapped NFS files, it breaks. Why?
A. Usually this is because application developers rely on certain local file system behaviors to guarantee data consistency, rather than reading the mmap man pages carefully to understand what behavior is required by all file system implementations. Some examples:

Although some implementations of munmap(2) happen to write dirty pages to local file systems, the NFS version of munmap(2) does not. An msync(2) call is always required to guarantee that dirty mapped data is written to permanent storage. A subtle ramification of the Linux NFS client's treatment of munmap(2) is that does not consider munmap(2) to be a close operation for the purposes of close-to-open cache coherency."

Re: PCD IO consistency on NFS - msync needed?

nizar sallem
+1 here,

boost::mapped_files also use the sync before unmapping. I think for
consistency we should just remove our mapping functions for PCD and rely
on the ones provided by boost (as done for PLY).

Cheers,
--
Nizar

On 10/10/2011 07:21 AM, kmatzen wrote:

> http://nfs.sourceforge.net/
>
> "This document provides an introduction to NFS as implemented in the Linux
> kernel."
>
> "D8. When my application uses memory-mapped NFS files, it breaks. Why?
> A. Usually this is because application developers rely on certain local file
> system behaviors to guarantee data consistency, rather than reading the mmap
> man pages carefully to understand what behavior is required by all file
> system implementations. Some examples:
>
> Although some implementations of munmap(2) happen to write dirty pages to
> local file systems, the NFS version of munmap(2) does not. An msync(2) call
> is always required to guarantee that dirty mapped data is written to
> permanent storage. A subtle ramification of the Linux NFS client's treatment
> of munmap(2) is that does not consider munmap(2) to be a close operation for
> the purposes of close-to-open cache coherency."

Re: PCD IO consistency on NFS - msync needed?

Radu B. Rusu
Administrator
In reply to this post by kmatzen
I'll test the effects of msync on a few systems today. Can someone do the same on Windows?

Personally, if msync solves your problem, and doesn't affect anything else, I vote we integrate it :) Other suggestions?

Cheers,
Radu.
--
Point Cloud Library (PCL) - http://pointclouds.org

On 10/10/2011 07:21 AM, kmatzen wrote:

> http://nfs.sourceforge.net/
>
> "This document provides an introduction to NFS as implemented in the Linux
> kernel."
>
> "D8. When my application uses memory-mapped NFS files, it breaks. Why?
> A. Usually this is because application developers rely on certain local file
> system behaviors to guarantee data consistency, rather than reading the mmap
> man pages carefully to understand what behavior is required by all file
> system implementations. Some examples:
>
> Although some implementations of munmap(2) happen to write dirty pages to
> local file systems, the NFS version of munmap(2) does not. An msync(2) call
> is always required to guarantee that dirty mapped data is written to
> permanent storage. A subtle ramification of the Linux NFS client's treatment
> of munmap(2) is that does not consider munmap(2) to be a close operation for
> the purposes of close-to-open cache coherency."

Re: PCD IO consistency on NFS - msync needed?

Radu B. Rusu
Administrator
Bad news: using msync() cuts the effective I/O rate to roughly a third (from 24 fps to 8 fps) when writing binary compressed files.

Suggestions on how to solve this are welcome.
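
For anyone who wants to reproduce this kind of comparison on their own storage, here is a rough timing sketch, assuming a POSIX system. The names writeMapped and secondsPerFrame are hypothetical; it does not use the PCD writer and does no compression, so the numbers will not match the figures above:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: writes `data` to `path` through a shared mapping.
// Returns false if any system call fails.
bool
writeMapped (const std::string &path, const std::vector<char> &data, bool use_msync)
{
  int fd = open (path.c_str (), O_RDWR | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return (false);
  if (ftruncate (fd, static_cast<off_t> (data.size ())) != 0)
  {
    close (fd);
    return (false);
  }
  void *map = mmap (nullptr, data.size (), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (map == MAP_FAILED)
  {
    close (fd);
    return (false);
  }
  std::memcpy (map, data.data (), data.size ());
  // The blocking flush is the only difference between the two variants.
  bool ok = !use_msync || msync (map, data.size (), MS_SYNC) == 0;
  ok = (munmap (map, data.size ()) == 0) && ok;
  ok = (close (fd) == 0) && ok;
  return (ok);
}

// Average wall-clock seconds per file over `frames` writes, or -1.0 on error.
double
secondsPerFrame (const std::string &path, std::size_t bytes, int frames, bool use_msync)
{
  std::vector<char> data (bytes, 'x');
  auto start = std::chrono::steady_clock::now ();
  for (int i = 0; i < frames; ++i)
    if (!writeMapped (path, data, use_msync))
      return (-1.0);
  std::chrono::duration<double> elapsed = std::chrono::steady_clock::now () - start;
  return (elapsed.count () / frames);
}
```

Calling secondsPerFrame twice with the same path and size, once with use_msync set and once without, gives two numbers whose ratio can be compared. Results depend heavily on the file system and on how much write-back the kernel defers.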

Cheers,
Radu.
--
Point Cloud Library (PCL) - http://pointclouds.org

On 10/10/2011 08:29 AM, Radu B. Rusu wrote:

> I'll test the effects of msync on a few systems today. Can someone do the same in Windows?
>
> Personally, if msync solves your problem, and doesn't affect anything else, I vote we integrate it :) Other suggestions?
>
> Cheers,
> Radu.
> --
> Point Cloud Library (PCL) - http://pointclouds.org

Re: PCD IO consistency on NFS - msync needed?

kmatzen
I feel like your users probably fall into two camps.  There are those that are doing real-time data acquisition and there are those that are doing offline processing.  Based on your notes that your laptop disks write at ~100 MB/s (btw, what "normal" laptop disks are you using that achieve this...I get 80 MB/s on a good "normal" laptop disk) I would suspect that if you did experience data corruption, then it would be infrequent and easily detected via an incoherent header.  However, those in the offline processing camp have no excuse for corrupted data since there is presumably all the time in the world to wait for the write-back.  Maybe this could be introduced as a compile option to choose either correctness or performance.  OpenCV users already have to make this choice with -ffast-math according to the recent post.  Although, I would suspect that the performance of all this mmap'ing stuff would degrade to the naive write approach once you introduce the blocking msync, unless I'm missing something important (I don't understand the stripping comment).

The only other thing I can think of is to use msync with MS_ASYNC to prevent the apparently expensive blocking call, but then you will have a more difficult time managing the lifecycle of the mapping and will presumably have a larger memory footprint.  You'll probably end up blocking to write back the backlog anyway.
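
To make the lifecycle cost concrete, here is a rough sketch of the MS_ASYNC idea, assuming a POSIX system. The class AsyncMappedWriter is hypothetical and not part of PCL. Each write schedules write-back and returns, but the mapping and its descriptor have to be kept until a later blocking drain:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <deque>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of PCL).
class AsyncMappedWriter
{
  public:
    AsyncMappedWriter () = default;
    AsyncMappedWriter (const AsyncMappedWriter&) = delete;
    AsyncMappedWriter& operator= (const AsyncMappedWriter&) = delete;
    ~AsyncMappedWriter () { drain (); }

    // Maps `path`, copies `size` bytes, schedules write-back and returns.
    bool
    write (const std::string &path, const char *data, std::size_t size)
    {
      int fd = open (path.c_str (), O_RDWR | O_CREAT | O_TRUNC, 0644);
      if (fd < 0)
        return (false);
      if (ftruncate (fd, static_cast<off_t> (size)) != 0)
      {
        close (fd);
        return (false);
      }
      void *map = mmap (nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (map == MAP_FAILED)
      {
        close (fd);
        return (false);
      }
      std::memcpy (map, data, size);
      if (msync (map, size, MS_ASYNC) != 0)
      {
        munmap (map, size);
        close (fd);
        return (false);
      }
      // The mapping stays alive, so memory use grows with the backlog.
      pending_.push_back (Mapping{map, size, fd});
      return (true);
    }

    // Blocks until every pending mapping is flushed, then releases it.
    bool
    drain ()
    {
      bool ok = true;
      while (!pending_.empty ())
      {
        Mapping m = pending_.front ();
        pending_.pop_front ();
        if (msync (m.addr, m.size, MS_SYNC) != 0)
          ok = false;
        if (munmap (m.addr, m.size) != 0)
          ok = false;
        if (close (m.fd) != 0)
          ok = false;
      }
      return (ok);
    }

    std::size_t
    pending () const { return (pending_.size ()); }

  private:
    struct Mapping
    {
      void *addr;
      std::size_t size;
      int fd;
    };
    std::deque<Mapping> pending_;
};
```

The blocking cost has not gone away here; it has moved into drain(). As far as I recall, the Linux msync(2) man page also notes that MS_ASYNC has been effectively a no-op since 2.6.19, so how much this buys on a given system would need measuring.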

In any case, thanks for looking into this. :)

Re: PCD IO consistency on NFS - msync needed?

Radu B. Rusu
Administrator
Adding it as an option to the writer makes sense to me. However, in order to preserve the 1.x API, I would add it only
to PCDWriter, and set it to false for the savePCD* methods in the pcl namespace.

Would this work for you? Something along the lines of:

PCDWriter w;
w.setSynchronization (true);
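
One way such a switch could be wired up is sketched below, assuming a POSIX system. MappedWriter and saveMapped are hypothetical stand-ins rather than the actual PCDWriter or savePCD* code; only the setter name is taken from the snippet above. The flag defaults to false, so the free function keeps the fast path and existing callers see no change:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stand-in for the proposed option; this is not PCL's PCDWriter.
class MappedWriter
{
  public:
    MappedWriter () : sync_ (false) {}

    // When true, write() calls msync (MS_SYNC) before unmapping.
    void
    setSynchronization (bool sync) { sync_ = sync; }

    bool
    getSynchronization () const { return (sync_); }

    // Returns 0 on success and -1 on failure.
    int
    write (const std::string &path, const std::string &payload) const
    {
      int fd = open (path.c_str (), O_RDWR | O_CREAT | O_TRUNC, 0644);
      if (fd < 0)
        return (-1);
      if (ftruncate (fd, static_cast<off_t> (payload.size ())) != 0)
      {
        close (fd);
        return (-1);
      }
      void *map = mmap (nullptr, payload.size (), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (map == MAP_FAILED)
      {
        close (fd);
        return (-1);
      }
      std::memcpy (map, payload.data (), payload.size ());

      bool ok = true;
      // Only pay for the blocking flush when the caller asked for it.
      if (sync_ && msync (map, payload.size (), MS_SYNC) != 0)
        ok = false;
      if (munmap (map, payload.size ()) != 0)
        ok = false;
      if (close (fd) != 0)
        ok = false;
      return (ok ? 0 : -1);
    }

  private:
    bool sync_;
};

// Hypothetical free function that keeps the unsynchronized default.
inline int
saveMapped (const std::string &path, const std::string &payload)
{
  MappedWriter w;
  return (w.write (path, payload));
}
```

Offline jobs writing to NFS would construct the writer and enable synchronization explicitly, while real-time acquisition code could keep calling the free function.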


Cheers,
Radu.
--
Point Cloud Library (PCL) - http://pointclouds.org

On 10/10/2011 11:26 AM, kmatzen wrote:

> I feel like your users probably fall into two camps.  There are those that
> are doing real time data acquisition and there are those that are doing
> offline processing.  Based on your notes that your laptop disks write at
> ~100 mb/s (btw, what "normal" laptop disks are you using that achieve
> this...I get 80 mb/s on a good "normal" laptop disk) I would suspect that if
> you did experience data corruption, then it would be infrequent and is
> easily detected via incoherent header.  However, those in the offline
> processing camp have no excuse for corrupted data since there is presumably
> all the time in the world to wait for the write-back.  Maybe this could be
> introduced as a compile option to choose either correctness or performance.
> OpenCV users already have to make this choice with -ffast-math according to
> the recent post.  Although, I would suspect that the performance of all this
> mmap'ing stuff would degrade to the naive write approach once you introduce
> the blocking msync, unless I'm missing something important (I don't
> understand the stripping comment).
>
> The only other thing I can think of is to use msync with MS_ASYNC to prevent
> the apparently expensive blocking call, but then you will have a more
> difficult time managing the lifecycle of the mapping and will presumably
> have a larger memory footprint.  You'll probably end up blocking to
> write-back the backlog anyway.
>
> In any case, thanks for looking into this. :)

Re: PCD IO consistency on NFS - msync needed?

kmatzen
That makes sense to me.

Re: PCD IO consistency on NFS - msync needed?

Radu B. Rusu
Administrator
Done in r2688 (trunk). Will merge into branches/pcl-1.x in a separate commit.

Cheers,
Radu.
--
Point Cloud Library (PCL) - http://pointclouds.org

On 10/10/2011 11:34 AM, kmatzen wrote:

> That makes sense to me.

Re: PCD IO consistency on NFS - msync needed?

Radu B. Rusu
Administrator
In reply to this post by nizar sallem
Nizar,

We added this a long time ago. :)

Cheers,
Radu.
--
http://pointclouds.org

On 10/10/2011 07:43 AM, NIzar Sallem wrote:

> +1 here,
>
> boost::mapped_files also use the sync before unmapping. I think for consistency we should just remove our mapping
> functions for PCD and rely on the ones provided by boost (as done for PLY).
>
> Cheers,
> --
> Nizar