Forum Discussion

T_N
Level 6
7 years ago

Error bptm (pid=214108) cannot write image to disk, media close failed with status 2060019

07/17/2017 18:09:04 - Info bpbrm (pid=214105) ACDCFPS1 is the host to backup data from
07/17/2017 18:09:04 - Info bpbrm (pid=214105) reading file list for client
07/17/2017 18:09:04 - Info bpbrm (pid=214105) accelerator enabled
07/17/2017 18:09:05 - Info bpbrm (pid=214105) There is no complete backup image match with track journal, a regular full backup will be performed.
07/17/2017 18:09:05 - Info bpbrm (pid=214105) starting bpbkar on client
07/17/2017 18:09:05 - Info bpbkar (pid=68447) Backup started
07/17/2017 18:09:05 - Info bpbrm (pid=214105) bptm pid: 214108
07/17/2017 18:09:05 - Info bpbkar (pid=68447) INF - Backing up vCenter server CDCVCENTER1, ESX host cdcucsesx26.commscope.com, BIOS UUID 421fb1ec-4798-b361-1ba4-cc101da57d1f, Instance UUID 501fdf96-93a9-a03b-3f69-cbbf823585fd, Display Name ACDCFPS1, Hostname ACDCFPS1.commscope.com
07/17/2017 18:09:05 - Info bptm (pid=214108) start
07/17/2017 18:09:05 - Info bptm (pid=214108) using 1048576 data buffer size
07/17/2017 18:09:05 - Info bptm (pid=214108) using 512 data buffers
07/17/2017 18:09:06 - Info bptm (pid=214108) start backup
07/17/2017 18:10:36 - Info bptm (pid=214108) backup child process is pid 214973
07/17/2017 18:13:08 - Info bpbkar (pid=68447) INF - Transport Type =  nbd
07/17/2017 18:36:27 - Info nbjm (pid=31999) starting backup job (jobid=11960332) for client ACDCFPS1, policy VMW-Tier4, schedule Incremental
07/17/2017 18:36:27 - Info nbjm (pid=31999) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=11960332, request id:{C08988B8-6B48-11E7-947B-BCEF847371E6})
07/17/2017 18:36:27 - requesting resource stu_disk_cdcnetbudd6
07/17/2017 18:36:27 - requesting resource cdcnetbu02.NBU_CLIENT.MAXJOBS.ACDCFPS1
07/17/2017 18:36:27 - requesting resource cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC
07/17/2017 18:36:27 - requesting resource cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC
07/17/2017 18:36:27 - requesting resource cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC
07/17/2017 18:36:27 - requesting resource cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC
07/17/2017 18:36:27 - requesting resource cdcnetbu02.VMware.ESXserver.cdcucsesx26.commscope.com
07/17/2017 18:36:33 - granted resource  cdcnetbu02.NBU_CLIENT.MAXJOBS.ACDCFPS1
07/17/2017 18:36:33 - granted resource  cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC - Chicago Data Center/CDC_C1_VPLEX2_UCSPRD01_VMDATA_400
07/17/2017 18:36:33 - granted resource  cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC - Chicago Data Center/CDC_C1_VPLEX2_UCSPRD01_VMDATA_401
07/17/2017 18:36:33 - granted resource  cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC - Chicago Data Center/CDC_C1_VPLEX2_UCSPRD01_VMDATA_405
07/17/2017 18:36:33 - granted resource  cdcnetbu02.VMware.Datastore.CDCVCENTER1/CDC - Chicago Data Center/CDC_C1_VPLEX2_UCSPROD01_VMDATA_80
07/17/2017 18:36:33 - granted resource  cdcnetbu02.VMware.ESXserver.cdcucsesx26.commscope.com
07/17/2017 18:36:33 - granted resource  MediaID=@aaaak;DiskVolume=PureDiskVolume;DiskPool=dp_disk_cdcnetbudd6;Path=PureDiskVolume;StorageServer=cdcnetbudd6;MediaServer=cdcnetbudd6
07/17/2017 18:36:33 - granted resource  stu_disk_cdcnetbudd6
07/17/2017 18:36:33 - estimated 5193560994 kbytes needed
07/17/2017 18:36:33 - Info nbjm (pid=31999) started backup (backupid=ACDCFPS1_1500334593) job for client ACDCFPS1, policy VMW-Tier4, schedule Incremental on storage unit stu_disk_cdcnetbudd6 using backup host cdcnetbudd5
07/17/2017 18:36:33 - started process bpbrm (pid=214105)
07/17/2017 18:36:34 - connecting
07/17/2017 18:36:35 - connected; connect time: 0:00:00
07/17/2017 18:38:06 - begin writing
07/17/2017 22:31:37 - Critical bptm (pid=214108) Storage Server Error: (Storage server: PureDisk:cdcnetbudd6) mtstrm_close_write_channel: Fatal error occured in Multi-Threaded Agent: Close Write Channel command failed: Cr_ErrnoException: Timed out after waiting 1200s to send command Close Write Channel to mtstrmd V-454-96
07/17/2017 22:31:37 - Critical bptm (pid=214108) sts_close_handle failed: 2060019 error occurred on network socket
07/17/2017 22:31:37 - Error bptm (pid=214108) cannot write image to disk, media close failed with status 2060019
07/17/2017 22:32:04 - Info bptm (pid=214108) EXITING with status 87 <----------
07/17/2017 22:32:05 - Info cdcnetbudd6 (pid=214108) StorageServer=PureDisk:cdcnetbudd6; Report=PDDO Stats (multi-threaded stream used) for (cdcnetbudd6): scanned: 307202052 KB, CR sent: 141464738 KB, CR sent over FC: 0 KB, dedup: 54.0%, cache disabled
07/17/2017 22:32:10 - Info bpbkar (pid=0) done
07/17/2017 22:32:10 - Info bpbkar (pid=0) done. status: 87: media close error
07/17/2017 22:59:41 - end writing; write time: 4:21:35
media close error  (87)

5 Replies

  • Hi T_N,

    Because of the message below, I think the problem may be with your Multi-Threaded Agent. You are using MSDP, right? If so, take a look at the following.

    If I'm not mistaken, there is a parameter called SessionCloseTimeout that you can increase. According to this log, the timeout is set to 1200s. Maybe this can help you.

    Log:

    07/17/2017 22:31:37 - Critical bptm (pid=214108) Storage Server Error: (Storage server: PureDisk:cdcnetbudd6) mtstrm_close_write_channel: Fatal error occured in Multi-Threaded Agent: Close Write Channel command failed: Cr_ErrnoException: Timed out after waiting 1200s to send command Close Write Channel to mtstrmd V-454-96
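
    To compare the timeout in that message with whatever is configured, the wait time can be pulled out of the log line. The following is a minimal sketch: the helper name and the regular expression are hypothetical and based only on the wording of this one message. Whether the 1200s figure corresponds to SessionCloseTimeout is an assumption in this thread, not something the log confirms.

```python
import re

# Log line copied verbatim from the job details in the original post.
LOG_LINE = (
    "07/17/2017 22:31:37 - Critical bptm (pid=214108) Storage Server Error: "
    "(Storage server: PureDisk:cdcnetbudd6) mtstrm_close_write_channel: "
    "Fatal error occured in Multi-Threaded Agent: "
    "Close Write Channel command failed: Cr_ErrnoException: "
    "Timed out after waiting 1200s to send command Close Write Channel "
    "to mtstrmd V-454-96"
)

# Hypothetical pattern, derived only from the message above.
TIMEOUT_RE = re.compile(
    r"Timed out after waiting (\d+)s to send command (.+?) to mtstrmd"
)


def parse_mtstrmd_timeout(line):
    """Return (seconds, command) for an mtstrmd timeout line, else None."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_mtstrmd_timeout(LOG_LINE))
```

    Running this against the job details gives the wait time in seconds and the command that timed out, which you can then set against the values in mtstrm.conf.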

    Sorry for the long post. I would have preferred to post only the technotes (TNs), but neither is available, so I had to copy the text from my Evernote. This is a problem that is happening with some Veritas technotes; you can check the links if you want.

    https://www.veritas.com/support/en_US/article.000072903
    https://www.veritas.com/support/en_US/article.000072906

     

    About the MSDP Deduplication Multi-Threaded Agent

    Beginning with the NetBackup 7.6 release, the MSDP deduplication process can use a Multi-Threaded Agent for most data sources. The Multi-Threaded Agent runs alongside the deduplication plug-in on both the clients and the media servers. The agent uses multiple threads for asynchronous network I/O and CPU core calculations. During a backup, this agent receives data from the deduplication plug-in through shared memory and processes it using multiple threads to improve throughput performance. When inactive, the agent uses minimal resources.

    The NetBackup Deduplication Multi-Threaded Agent improves backup performance for both client-side deduplication and media server deduplication.

    The Deduplication Multi-Threaded Agent uses the default configuration values that control its behavior. You can change those values if you want to do so. The following table describes the Multi-Threaded Agent interactions and behaviors. It also provides links to the topics that describe how to configure those interactions and behaviors.

    MSDP mtstrm.conf file parameters

    The mtstrm.conf configuration file controls the behavior of the Deduplication Multi-threaded Agent. The default values balance performance with resource usage.

    A procedure exists that describes how to configure these parameters.

    The pd.conf file resides in the following directories:

    • (UNIX) /usr/openv/lib/ost-plugins/

    • (Windows) install_path\Veritas\NetBackup\bin\ost-plugins

    See Configuring the Deduplication Multi-Threaded Agent behavior.

    The mtstrm.conf file is comprised of three sections. The parameters must remain within their sections. For descriptions of the parameters, see the following sections:

    The mtstrm.conf file resides in the following directories:

    • UNIX: /usr/openv/lib/ost-plugins/

    • Windows: install_path\Veritas\NetBackup\bin\ost-plugins

     

    Logging parameters

    The following table describes the logging parameters of the mtstrm.conf configuration file.

    Table: Logging parameters (mtstrm.conf file)

    Parameter

    Description

    LogPath

    The directory in which the mtstrmd.log files are created.

    Default values:

    • Windows: LogPath=install_path\Veritas\pdde\\..\netbackup\logs\pdde

    • UNIX: LogPath=/var/log/puredisk

    Logging

    Specify what to log:

    Default value: Logging=short,thread.

    Possible values:

    minimal: Critical, Error, Authentication, Bug
    short  : all of the above plus Warning
    long   : all of the above plus Info
    verbose: all of the above plus Notice
    full   : all of the above plus Trace messages (everything)
    none   : disable logging

    To enable or disable other logging information, append one of the following to the logging value, without using spaces:

    ,thread  : enable thread ID logging.
    ,date    : enable date logging.
    ,timing  : enable high-resolution timestamps
    ,silent  : disable logging to console
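
    As an illustration of how a Logging value is put together from one level plus optional comma-separated modifiers with no spaces, here is a small sketch. The function name is hypothetical; the allowed words are taken from the lists above.

```python
# Levels and modifiers as listed in the Logging parameter description.
LEVELS = {"minimal", "short", "long", "verbose", "full", "none"}
MODIFIERS = {"thread", "date", "timing", "silent"}


def build_logging_value(level, *modifiers):
    """Join a level and optional modifiers into a Logging value.

    Hypothetical helper: rejects any word that is not in the lists above
    and joins the parts with commas and no spaces.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown logging level: {level!r}")
    for modifier in modifiers:
        if modifier not in MODIFIERS:
            raise ValueError(f"unknown logging modifier: {modifier!r}")
    return ",".join((level,) + modifiers)


if __name__ == "__main__":
    # The documented default is Logging=short,thread.
    print("Logging=" + build_logging_value("short", "thread"))
```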

    Retention

    How long to retain log files (in days) before NetBackup deletes them.

    Default value: Retention=7.

    Possible values: 0-9, inclusive. Use 0 to keep logs forever.

    LogMaxSize

    The maximum log size (MB) before NetBackup creates a new log file. The existing log files that are rolled over are renamed mtstrmd.log.<date/time stamp>

    Default value: LogMaxSize=500.

    Possible values: 1 to the maximum operating system file size in MB, inclusive.

     

    Process parameters

    The following table describes the process parameters of the mtstrm.conf configuration file.

    Table: Process parameters (mtstrm.conf file)

    Parameter

    Description

    MaxConcurrentSessions

    The maximum number of concurrent sessions that the Multi-Threaded Agent processes. If it receives a backup job when the MaxConcurrentSessions value is reached, the job runs as a single-threaded job.

    By default, the deduplication plug-in sends backup jobs to the Multi-Threaded Agent on a first-in, first-out basis. However, you can configure which clients and which backup policies the deduplication plug-in sends to the Multi-Threaded Agent. The MTSTRM_BACKUP_CLIENTS and MTSTRM_BACKUP_POLICIES parameters in the pd.conf control the behavior. Filtering the backup jobs that are sent to the Multi-Threaded Agent can be very helpful on the systems that have many concurrent backup jobs.

    See MSDP pd.conf file parameters.

    Default value: MaxConcurrentSessions= (calculated by NetBackup; see the following paragraph).

    NetBackup configures the value for this parameter during installation or upgrade. The value is the hardware concurrency value of the host divided by the BackupFpThreads value (see Table: Threads parameters (mtstrm.conf file)). (For the purposes of this parameter, the hardware concurrency is the number of CPUs or cores or hyperthreading units.) On media servers, NetBackup may not use all hardware concurrency for deduplication. Some may be reserved for other server processes.

    For more information about hardware concurrency, see the pd.conf file MTSTRM_BACKUP_ENABLED parameter description.

    See MSDP pd.conf file parameters.

    Possible values: 1-32, inclusive.

    Warning:

    Symantec recommends that you change this value only after careful consideration of how the change affects your system resources. With default configuration values, each session uses approximately 120 to 150 MBs of memory. The memory that is used is equal to (BackupReadBufferCount * BackupReadBufferSize) + (3 * BackupShmBufferSize) + FpCacheMaxMbSize (if enabled).
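
    The memory formula in the warning can be worked through with the documented defaults. This is only a sketch of the arithmetic; the function and argument names are hypothetical, and the default values come from the parameter descriptions in this post.

```python
def session_memory_mb(
    read_buffer_count=3,       # BackupReadBufferCount default
    read_buffer_size_mb=32,    # BackupReadBufferSize default
    shm_buffer_size_mb=2,      # BackupShmBufferSize default on UNIX (8 on Windows)
    fp_cache_max_mb=20,        # FpCacheMaxMbSize default
    fp_cache_enabled=True,
):
    """Approximate memory per session in MB, following the formula in the warning."""
    total = read_buffer_count * read_buffer_size_mb + 3 * shm_buffer_size_mb
    if fp_cache_enabled:
        total += fp_cache_max_mb
    return total


if __name__ == "__main__":
    unix_mb = session_memory_mb()
    windows_mb = session_memory_mb(shm_buffer_size_mb=8)
    print(f"UNIX defaults:    {unix_mb} MB per session")
    print(f"Windows defaults: {windows_mb} MB per session")
```

    With the defaults this comes to 122 MB on UNIX and 140 MB on Windows, which is consistent with the "approximately 120 to 150" figure. Multiplying by MaxConcurrentSessions gives a rough upper bound for the agent as a whole.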

    BackupShmBufferSize

    The size of the buffers (MB) for shared memory copying. This setting affects three buffers: The shared memory buffer itself, the shared memory receive buffer in the mtstrmd process, and the shared memory send buffer on the client process.

    Default value: BackupShmBufferSize=2 (UNIX) or BackupShmBufferSize=8 (Windows).

    Possible values: 1-16, inclusive.

    BackupReadBufferSize

    The size (MB) of the memory buffer to use per session for read operations from a client during a backup.

    Default value: BackupReadBufferSize=32.

    Possible values: 16-128, inclusive.

    BackupReadBufferCount

    The number of memory buffers to use per session for read operations from a client during a backup.

    Default value: BackupReadBufferCount=3.

    Possible values: 1 to 10, inclusive.

    BackupBatchSendEnabled

    Determines whether to use batch message protocols to send data to the storage server for a backup.

    Default value: BackupBatchSendEnabled=1.

    Possible values: 0 (disabled) or 1 (enabled).

    FpCacheMaxMbSize

    The maximum amount of memory (MB) to use per session for fingerprint caching.

    Default value: FpCacheMaxMbSize=20.

    Possible values: 0-1024, inclusive.

    SessionCloseTimeout

    The amount of time, in seconds, to wait for threads to finish processing when a session is closed before the agent times out with an error.

    Default value: 180.

    Possible values: 1-3600.
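
    Before changing SessionCloseTimeout, it can help to read the current value and check it against the documented range. The sketch below assumes that mtstrm.conf is an INI-style file with a [Process] section. That layout is an assumption based on the "three sections" description above, so check it against your own file. The sample text and the function name are hypothetical.

```python
import configparser

# Hypothetical sample; the real file lives in the ost-plugins directory.
# The [Process] section name is an assumption, not confirmed by the text above.
SAMPLE_MTSTRM_CONF = """
[Process]
SessionCloseTimeout=180
SessionInactiveThreshold=480
"""


def read_session_close_timeout(conf_text):
    """Return SessionCloseTimeout in seconds, checking the documented 1-3600 range."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    value = parser.getint("Process", "SessionCloseTimeout")
    if not 1 <= value <= 3600:
        raise ValueError(f"SessionCloseTimeout={value} is outside 1-3600")
    return value


if __name__ == "__main__":
    print(read_session_close_timeout(SAMPLE_MTSTRM_CONF))
```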

    SessionInactiveThreshold

    The number of minutes for a session to be idle before NetBackup considers it inactive. NetBackup examines the sessions and closes inactive ones during maintenance operations.

    Default value: 480.

    Possible values: 1-1440, inclusive.

    Threads parameters

    The following table describes the threads parameters of the mtstrm.conf configuration file.

    Table: Threads parameters (mtstrm.conf file)

    Parameter

    Description

    BackupFpThreads

    The number of threads to use per session to fingerprint incoming data.

    Default value: BackupFpThreads= (calculated by NetBackup; see the following explanation).

    NetBackup configures the value for this parameter during installation or upgrade. The value is equal to the following hardware concurrency threshold values.

    • Windows and Linux: The threshold value is 2.

    • Solaris: The threshold value is 4.

    For more information about hardware concurrency, see the pd.conf file MTSTRM_BACKUP_ENABLED parameter description.

    See MSDP pd.conf file parameters.

    BackupSendThreads

    The number of threads to use per session to send data to the storage server during a backup operation.

    Default value: BackupSendThreads=1 for servers and BackupSendThreads=2 for clients.

    Possible values: 1-32, inclusive.

    MaintenanceThreadPeriod

    The frequency at which NetBackup performs maintenance operations, in minutes.

    Default value: 720.

    Possible values: 0-10080, inclusive. Zero (0) disables maintenance operations.

     

    Regards,

     

    Thiago Ribeiro

     

    • T_N
      Level 6

      I moved that backup to another NetBackup appliance, and it's still running. I think deduplication jobs running at the same time may be causing the error. Veritas support said: "The Transparent Huge Pages feature is enabled by default in RHEL/CentOS 6 or 7. The kernel will always attempt to satisfy a high-order memory allocation using hugepages. If no hugepages are available, the kernel will try to defrag memory to get hugepages. This defrag effort is time-consuming when system is under memory pressure and will cause high latency to user-land processes." He asked me to:

      Disable THP without rebooting, and also disable THP at boot time by adding transparent_hugepage=never to the end of the kernel line and rebooting. I will reboot that appliance next Thursday and see what happens :)
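
      To confirm whether THP is actually off before and after the change, the active mode can be read from sysfs. This is a minimal sketch, assuming the standard Linux path /sys/kernel/mm/transparent_hugepage/enabled. Some RHEL 6 kernels use a redhat_transparent_hugepage directory instead, so adjust the path if the file is missing. The function names are hypothetical.

```python
from pathlib import Path

# Standard sysfs location; assumed here, may differ on some RHEL 6 kernels.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed mode from a line such as '[always] madvise never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED_PATH):
    """Read the active THP mode from sysfs, or return None if the file is absent."""
    if not path.exists():
        return None
    return active_thp_mode(path.read_text())


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

      If the reported mode is "never", THP is disabled; "always" or "madvise" means it is still active in some form.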

    • T_N
      Level 6

      I had 2 backup jobs complete successfully, and it fails until now. Yes, I have more space on MSDP (129 TB free).