Posted

Linux, Nagios plus a bit of routing. Several times a day the following shows up in the logs:

 

hdc: status timeout: status=0xd0 { Busy }
ide: failed opcode was: unknown
hdc: no DRQ after issuing MULTWRITE_EXT
ide1: reset: success

 

Can anyone decode this? Is it the beginning of the end, or is the drive simply not keeping up with the number of accesses (because of Nagios, for example)?
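For anyone decoding the status byte by hand, here is a minimal Python sketch. It assumes the standard ATA status register layout (BSY 0x80, DRDY 0x40, DF 0x20, DSC 0x10, DRQ 0x08, CORR 0x04, IDX 0x02, ERR 0x01); the helper names `decode_status` and `find_timeouts` are made up for illustration. While BSY is set the other bits are not meaningful, which is presumably why the kernel prints only `{ Busy }` for 0xd0.

```python
import re

# Standard ATA status register bits (assumed layout, highest bit first).
STATUS_BITS = [
    (0x80, "BSY"),   # drive is busy
    (0x40, "DRDY"),  # drive ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]

def decode_status(status):
    """Return the names of the meaningful bits set in an ATA status byte."""
    if status & 0x80:
        # With BSY set the remaining bits are not valid, so report BSY only.
        return ["BSY"]
    return [name for mask, name in STATUS_BITS if status & mask]

# Matches the first kernel message quoted in the post above.
TIMEOUT = re.compile(r"(hd[a-z]): status timeout: status=(0x[0-9a-fA-F]+)")

def find_timeouts(lines):
    """Return (drive, decoded_flags) for every status-timeout message."""
    found = []
    for line in lines:
        m = TIMEOUT.search(line)
        if m:
            found.append((m.group(1), decode_status(int(m.group(2), 16))))
    return found

sample = [
    "hdc: status timeout: status=0xd0 { Busy }",
    "ide: failed opcode was: unknown",
    "hdc: no DRQ after issuing MULTWRITE_EXT",
    "ide1: reset: success",
]
result = find_timeouts(sample)
print(result)
```

Read this way, the drive was still busy when the driver gave up waiting, never raised DRQ for the pending multi-sector write, and the channel reset recovered it. That alone does not say whether the drive, the cable, or the power is at fault.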

Posted (edited)

smartctl -H  /dev/hdc
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

 

smartctl -A  /dev/hdc
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x000f   075   063   044    Pre-fail  Always       -       155568880
 3 Spin_Up_Time            0x0003   099   099   000    Pre-fail  Always       -       0
 4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       24
 5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       198862163
 9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21818
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   037   020    Old_age   Always       -       24
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   056   000    Old_age   Always       -       4295035960
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   073   064   045    Old_age   Always       -       27 (Lifetime Min/Max 10/36)
194 Temperature_Celsius     0x0022   027   040   000    Old_age   Always       -       27 (0 10 0 0)
195 Hardware_ECC_Recovered  0x001a   039   031   000    Old_age   Always       -       155568880
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
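To compare the normalized values against their thresholds without eyeballing the columns, a small Python sketch follows. It assumes the ten-column layout of the `smartctl -A` table above; `parse_attributes` and `prefail_margins` are hypothetical helper names, and the sample is a subset of the rows above.

```python
def parse_attributes(text):
    """Parse rows of a `smartctl -A` attribute table into dictionaries."""
    rows = []
    for line in text.splitlines():
        # Split into at most 10 fields so a RAW_VALUE with spaces stays whole.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip the header and any non-attribute lines
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "raw": parts[9].strip(),
        })
    return rows

def prefail_margins(rows):
    """Map each Pre-fail attribute name to VALUE minus THRESH."""
    return {r["name"]: r["value"] - r["thresh"]
            for r in rows if r["type"] == "Pre-fail"}

sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   075   063   044    Pre-fail  Always       -       155568880
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x001a   039   031   000    Old_age   Always       -       155568880
"""

rows = parse_attributes(sample)
margins = prefail_margins(rows)
print(margins)
```

None of the Pre-fail attributes in this subset has reached its threshold, which is consistent with the PASSED health result.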

 

Full output:

 

smartctl -a /dev/hdc
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     ST3250310NS
Serial Number:    9SF0Q7LK
Firmware Version: SN06
User Capacity:    250,059,350,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Mon Apr 23 13:14:44 2012 NOVT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

 

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 634) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  60) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103d) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   075   063   044    Pre-fail  Always       -       155568880
  3 Spin_Up_Time            0x0003   099   099   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       24
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       198863060
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21818
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   037   020    Old_age   Always       -       24
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   056   000    Old_age   Always       -       4295035960
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   074   064   045    Old_age   Always       -       26 (Lifetime Min/Max 10/36)
194 Temperature_Celsius     0x0022   026   040   000    Old_age   Always       -       26 (0 10 0 0)
195 Hardware_ECC_Recovered  0x001a   039   031   000    Old_age   Always       -       155568880
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

 

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                     Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         18112         -
# 2  Short offline       Completed without error       00%         18088         -
# 3  Short offline       Completed without error       00%         18064         -
# 4  Short offline       Completed without error       00%         18040         -
# 5  Short offline       Completed without error       00%         18016         -
# 6  Extended offline    Interrupted (host reset)      20%         17993         -
# 7  Short offline       Completed without error       00%         17992         -
# 8  Short offline       Completed without error       00%         17968         -
# 9  Short offline       Completed without error       00%         17944         -
#10  Short offline       Completed without error       00%         17920         -
#11  Short offline       Completed without error       00%         17896         -
#12  Short offline       Completed without error       00%         17872         -
#13  Short offline       Completed without error       00%         17848         -
#14  Extended offline    Completed without error       00%         17826         -
#15  Short offline       Completed without error       00%         17824         -
#16  Short offline       Completed without error       00%         17800         -
#17  Short offline       Completed without error       00%         17776         -
#18  Short offline       Completed without error       00%         17752         -
#19  Short offline       Completed without error       00%         17728         -
#20  Short offline       Completed without error       00%         17704         -
#21  Short offline       Completed without error       00%         17680         -
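A quick arithmetic check on the self-test log, as a Python sketch. The numbers are copied from the output above (Power_On_Hours 21818 and the LifeTime column of the five most recent short tests); the variable names are just for illustration.

```python
# Values copied from the smartctl output above.
power_on_hours = 21818
short_test_hours = [18112, 18088, 18064, 18040, 18016]  # entries #1 to #5

# Hours of operation since the most recent logged self-test.
hours_since_last_test = power_on_hours - max(short_test_hours)

# Spacing between consecutive short tests in the log.
intervals = [a - b for a, b in zip(short_test_hours, short_test_hours[1:])]

print(hours_since_last_test, intervals)
```

So the short tests ran every 24 power-on hours, and the last one logged is 3706 power-on hours (roughly 154 days of uptime) older than this report. The clean self-test log therefore says little about the drive's state at the time the timeouts appeared.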

 

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

 

Edited by kaktak
  • 2 weeks later...
Posted

1 Raw_Read_Error_Rate     0x000f   075   063   044    Pre-fail  Always       -       155568880

7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       198862163

195 Hardware_ECC_Recovered  0x001a   039   031   000    Old_age   Always       -       155568880

What a fine piece of junk.

Posted

If the drive is a Seagate, large raw values for the first two parameters are normal.

 

Yes, I have noticed that too. And what would you say about the rest?
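On why large Seagate raw values are usually harmless: a Python sketch based on the commonly cited (not vendor-documented, so treat it as an assumption) reading that the 48-bit raw value of Raw_Read_Error_Rate and Seek_Error_Rate packs an error count in the upper 16 bits and an operation count in the lower 32. The second helper splits a raw value into three 16-bit words, which is how newer smartmontools releases display attribute 188 (named Command_Timeout there) on Seagate drives; what each word means is also an assumption. Both function names are hypothetical.

```python
def split_seagate_raw(raw):
    """Split a 48-bit raw value into (errors, operations).

    Assumed layout: upper 16 bits = error count, lower 32 bits = op count.
    """
    errors = (raw >> 32) & 0xFFFF
    operations = raw & 0xFFFFFFFF
    return errors, operations

def split_raw16(raw):
    """Split a 48-bit raw value into three 16-bit words, highest first."""
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF

# Raw values copied from the smartctl -A output earlier in the thread.
read_errors = split_seagate_raw(155568880)   # attribute 1
seek_errors = split_seagate_raw(198862163)   # attribute 7
attr_188 = split_raw16(4295035960)           # attribute 188

print(read_errors, seek_errors, attr_188)
```

Under that reading, attributes 1 and 7 show zero errors and the big numbers are only operation counters. Attribute 188 is the one that is not simply zero: its words come out as 1, 1 and 3128, which would fit a drive that has logged command timeouts, though the exact semantics are not confirmed here.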

Posted

Regarding Hardware_ECC_Recovered: perhaps some smart aleck has coiled the SATA cable into a spring.

 

PS

In your case I would replace the drive, the SATA cable, and possibly the power adapter (if there is one and the drive is not in a drive cage, although a cage is not blameless either). There is a high risk of losing data while you hunt for the cause, and the cause would have to be fixed the same way anyway.
